One of the numerous question marks for the Astros' offense going into 2011 is whether Carlos Lee will rebound to an offensive level closer to his career marks. GM Ed Wade has indicated that he is depending on Lee returning to his former performance. That is one reason that an offhand remark by Dave Cameron in an unrelated FanGraphs article attracted my attention.
In a discussion of Alcides Escobar, who was part of the Greinke trade, Cameron pointed to Escobar's weak line drive batting average in 2010 as an indication of bad luck:
The main cause of Escobar’s pitiful 2010 slash line? His .613 batting average on line drives, second worst in baseball among full time players – only Carlos Lee (.612) had a worse outcome on line drives. Of the 93 balls he hit hard enough to be judged liners, he only ended up with 57 hits. The league average is around .725 in most years, and the year-to-year correlation in BA on line drives is a minuscule .015, as the results appear to be mostly random.
Did you see what caught my eye? The mention of Carlos Lee. And I verified that Carlos Lee had the worst batting average (among qualifying players) on line drives in 2010. If Cameron's thesis is true that line drive batting average is relatively random, then this provides us a nice dose of optimism for improved hitting by Carlos in 2011.
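To get a rough feel for how far chance alone can push a number like this, here is a minimal sketch using only the figures in the quote (57 hits on 93 line drives, league average around .725). It assumes every line drive is an independent trial with the league-average chance of becoming a hit. That simple binomial model is my assumption for illustration, not anything Cameron describes.

```python
import math

# Numbers quoted above: 57 hits on 93 line drives, league average around .725.
hits, line_drives = 57, 93
league_avg = 0.725  # approximate league figure from the quote

observed = hits / line_drives

# Assumption: each line drive is an independent trial with the league-average
# chance of falling for a hit (a plain binomial model, used only as a sketch).
std_error = math.sqrt(league_avg * (1 - league_avg) / line_drives)
z_score = (observed - league_avg) / std_error

print(f"observed LD average: {observed:.3f}")
print(f"one-season spread from chance alone: {std_error:.3f}")
print(f"standard deviations below league average: {abs(z_score):.1f}")
```

Under that assumption, a hitter with about 93 line drives would swing roughly 46 points in either direction from luck alone, and Escobar's mark sits about 2.4 standard deviations below average. That is unusual for any one hitter, but not shocking for the worst or second-worst result in a large pool of regulars.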
As much as I want to be optimistic, Cameron's use of line drive batting average as a random result was new to me. Although it makes some sense, I couldn't recall any other studies which reached that conclusion. Also, given that there may be some subjectivity in distinguishing line drives from fly balls, one can imagine that some players are hitting mostly borderline line drives. And that's not even mentioning the possibility that stringers' bias could affect the classification of line drives and fly balls at various ballparks. Therefore, I decided to undertake a simple test to see if above- or below-average line drive batting average tends to regress toward the major league mean.
I picked 2007 out of the air, and used the 15 highest batting averages on line drives in that year. (Edwin Encarnacion was No. 1 with a .859 batting average, and Albert Pujols was No. 15 with a .821 batting average.) All but one of those 15 hitters had a lower batting average on line drives the next year, 2008. On average, the 15 hitters had a reduction in line drive batting average of over 100 points.
I also examined the 5 lowest batting averages on line drives in 2007, and made a similar comparison for those players in 2008. (The five players were Theriot, Lugo, Kendall, A. Jones, and D. Roberts.) All but one (Kendall) of the five showed an increase in batting average on line drives in the following year (2008). The average increase in line drive batting average was 124 points.
I've summarized the 2007/2008 averages below:
Top 15 LD Batting Average, 2007: .836
Next year (2008) LD Batting Average: .734
Bottom 5 LD Batting Average, 2007: .612
Next year (2008) LD Batting Average: .742
All players LD Batting Average, 2008: .728
For those of you who wonder what we mean by player regression, the comparisons above are a good example. The highest and lowest line drive batting averages in 2007 move close to the overall average line drive batting average in the next year. This little comparison supports Cameron's supposition that line drive batting average is more or less random. In case you are wondering, the Carlos Lee equivalents in 2007, Dave Roberts and Andruw Jones, increased their line drive batting averages into the .800's in 2008.
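For anyone who wants to see what regression looks like when results are nothing but luck, here is a small simulation sketch. Everything in it is made up for illustration: every simulated hitter has the same true line drive average (.725), and the number of hitters (150) and line drives per hitter (100) are hypothetical settings of mine, not real 2007 or 2008 data.

```python
import random


def simulate_season(rng, n_players, n_line_drives, true_avg):
    """Return one simulated line drive batting average per player."""
    return [
        sum(rng.random() < true_avg for _ in range(n_line_drives)) / n_line_drives
        for _ in range(n_players)
    ]


def mean(values):
    return sum(values) / len(values)


# All of these settings are assumptions chosen for illustration.
TRUE_AVG = 0.725      # every simulated hitter has the same true skill
N_PLAYERS = 150       # hypothetical number of qualifying hitters
N_LINE_DRIVES = 100   # hypothetical line drives per hitter per season

rng = random.Random(2007)  # fixed seed so the run is repeatable
year1 = simulate_season(rng, N_PLAYERS, N_LINE_DRIVES, TRUE_AVG)
year2 = simulate_season(rng, N_PLAYERS, N_LINE_DRIVES, TRUE_AVG)

# Rank hitters by their year-one result, best first.
ranked = sorted(range(N_PLAYERS), key=lambda i: year1[i], reverse=True)
top15, bottom5 = ranked[:15], ranked[-5:]

top15_y1 = mean([year1[i] for i in top15])
top15_y2 = mean([year2[i] for i in top15])
bottom5_y1 = mean([year1[i] for i in bottom5])
bottom5_y2 = mean([year2[i] for i in bottom5])

print(f"top 15:   year 1 {top15_y1:.3f} -> year 2 {top15_y2:.3f}")
print(f"bottom 5: year 1 {bottom5_y1:.3f} -> year 2 {bottom5_y2:.3f}")
```

Because the simulated hitters differ only by luck, the year-one leaders and trailers should both land near .725 in year two. If real hitters had large, persistent differences in line drive skill, the extreme groups would stay apart instead of collapsing toward the league average the way the 2007 groups did.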
Like those two unlucky hitters in 2007, Lee was hurt by results on hard-hit balls in 2010: his line drives fell for hits about 15% less often than average (.612 versus a league mark around .725). Over his career, Lee's line drive batting average is .726, which is close to average. And an examination of Lee's year-to-year line drive batting averages indicates an up-and-down pattern consistent with random variation.
Chris Johnson is the opposite of Lee with respect to line drive batting average: his .803 was among the highest in baseball in 2010. And, if line drive batting average is random, then this supports our suspicion that Johnson is likely to sustain a significant decline in his batting average in 2011.
This comparison doesn't necessarily tell us what to expect in terms of future power from Carlos Lee. In 2010, Lee's slugging and isolated power on line drives were substantially below his average line drive power measures. It's not really obvious whether there is a declining power trend for Lee's line drives. And the extent to which slugging/ISO on line drives is random from year to year is not clear. If a ball is hit hard, there probably is a significant element of luck in determining whether it turns into extra bases. (On the other hand, it's possible that declining foot speed is a contributing factor too.) One reason for the low SLG and ISO on line drives is that Lee had no HRs off line drives in 2010. Lee has 26 HRs off line drives over his career, and the year-to-year pattern looks relatively random. Lee also had zero line drive HRs in 2007, so 2010 isn't an isolated case.
This analysis also piqued my interest in examining other batting averages by batted ball types for Astros' hitters. I plan on writing more on this subject next week.