
Sabermetrics: Astros Hitters' Luck

Talking Astros Sabermetrics: What can BABIP tell us about Astros hitters' luck?

Scott Halleran

Last week we examined the Batting Average on Balls in Play (BABIP) of Astros' starting pitchers. Now we will evaluate the BABIP of Astros' hitters. In both instances, we use BABIP to gauge the extent to which a player's results-oriented stats this year (like ERA and batting average) reflect good or bad "luck" on batted balls that turn into hits.
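For readers who want the arithmetic behind the stat, here is a minimal sketch of the standard BABIP definition: hits minus home runs, divided by at bats minus strikeouts minus home runs plus sacrifice flies. The function name and the stat line are made up for illustration and do not belong to any Astros player.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting Average on Balls in Play, using the standard definition:
    (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play, BABIP is undefined")
    return (hits - home_runs) / balls_in_play


# Hypothetical stat line, invented purely for illustration.
example = babip(hits=150, home_runs=20, at_bats=500, strikeouts=100, sac_flies=5)
print(f"{example:.3f}")  # 130 / 385 -> 0.338
```

Home runs and strikeouts drop out because neither gives the defense a chance to make a play, which is why BABIP isolates what happens once the ball is put in play.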

Evaluating hitters' BABIP is not as straightforward as evaluating pitchers' BABIP. According to accepted sabermetric principles, batters have more control over BABIP than pitchers do. We assume that a pitcher's BABIP will regress to a value within a relatively narrow range around the league average. The range of hitters' expected BABIP is much larger because BABIP is driven by the diverse range of hitting skills at the major league level.

As with the evaluation of pitchers in last week's sabermetric article, we can estimate an "expected BABIP" (x-BABIP) for each hitter's performance so far this season. The x-BABIP method relies upon batted ball data to provide a benchmark for evaluating whether the batter's hits and outs on balls in play appear to be "lucky" or "unlucky." Comparing actual BABIP to the x-BABIP benchmark suggests an under- or over-performance during the period in question. An over-performance can be a red flag that the player's batting may regress downward in the future. Conversely, an under-performance tells us, at the least, that the hitter has a good chance of improving his batting average with normal regression to the mean.

Hitters' BABIP can be extremely volatile from month to month and year to year. For that reason, x-BABIP is helpful in giving us an indication of whether a hitter's slumping or skyrocketing batting stats are "real" or not. Obviously the stats are real in that they happened and helped or hurt the team's chances of winning games. But BABIP "luck" can result in a mirage, leading us to believe that a hitter is better or worse than his actual skill level.

The early major league experience of Chris Johnson and Jimmy Paredes provides Astros fans with examples of putting too much trust in a player's BABIP. In Johnson's first extended call-up, in 2010, he hit for a .308 batting average and a 119 wRC+, leading General Manager Ed Wade to pencil him in as a future franchise player for the Astros.

Oh, but wait, Chris Johnson's BABIP was .387 in 2010---we can't expect that performance to be the norm.

In the next year, Johnson's BABIP declined 70 points, and his batting average fell to .251 with an 81 wRC+. Ed Wade was disappointed, but should we have been surprised? (No.) You can make the same comparisons of Paredes' 2011 (.383 BABIP) and 2012 (.255 BABIP) seasons. These were the same players, with the same skills, in both seasons. The x-BABIP concept assists us in separating the real from the mirage.

For estimating x-BABIP I have used the most recent modification of the formula used by Jeff Zimmerman (with the help of Robert Boden) at Fangraphs. This formula uses ground balls, fly balls, infield hits, infield pop-ups, line drives, bunt hits, and home runs as data inputs for estimating a player's x-BABIP. I rely upon the average weightings applicable to those inputs for the period 2009-2012 to estimate x-BABIP for the 2013 performance. Because the selection of an x-BABIP formula is somewhat subjective, I have also used the older Hardball Times x-BABIP formula, which followed up on the groundbreaking Hardball Times article on BABIP by Chris Dutton and Peter Bendix. Using both formulae gives us a range of expectations.

In the table below, the Astros hitters' BABIP so far this season is compared to the x-BABIP for the same period. The Fangraphs formula is x-BABIP(1) and the Hardball Times formula is x-BABIP(2). The difference between BABIP and x-BABIP is shown as a negative percentage for an under performance (x-BABIP is higher than BABIP) and a positive percentage for an over performance (x-BABIP is lower than BABIP). Note that I have excluded hitters who have a small number of at bats (such as Paredes and Maxwell).





| Player | BABIP | x-BABIP(1) | x-BABIP(2) | Over / (Under) Performance(1) | Over / (Under) Performance(2) |
|---|---|---|---|---|---|
| Altuve | 0.326 | 0.342 | 0.338 | -4.7% | -3.6% |
| Dominguez | 0.245 | 0.274 | 0.295 | -11.7% | -20.2% |
| Castro | 0.333 | 0.348 | 0.336 | -4.3% | -0.8% |
| Barnes | 0.364 | 0.348 | 0.328 | 4.5% | 10.0% |
| Pena | 0.285 | 0.318 | 0.338 | -11.8% | -18.7% |
| Carter | 0.326 | 0.307 | 0.313 | 5.7% | 3.9% |
| Martinez | 0.313 | 0.336 | 0.330 | -7.0% | -5.3% |
| Gonzalez | 0.261 | 0.304 | 0.319 | -16.4% | -22.3% |
| Cedeno | 0.329 | 0.352 | 0.317 | -6.9% | 3.7% |
| Corporan | 0.379 | 0.376 | 0.349 | 0.9% | 8.0% |

Notes: (1) = Fangraphs x-BABIP; (2) = Hardball Times x-BABIP
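The over/under percentages can be approximated from the three BABIP columns. The sketch below is an assumption about the arithmetic, not a stated method: the article does not say which value the gap is divided by, and dividing by actual BABIP is simply the choice that comes closest to the published figures. Because the inputs here are the rounded three-decimal values, the results can differ from the table by a few tenths of a percentage point. The names `TABLE` and `over_under_pct` are illustrative.

```python
# Rounded values copied from the table: (BABIP, x-BABIP(1), x-BABIP(2)).
TABLE = {
    "Altuve":    (0.326, 0.342, 0.338),
    "Dominguez": (0.245, 0.274, 0.295),
    "Castro":    (0.333, 0.348, 0.336),
    "Barnes":    (0.364, 0.348, 0.328),
    "Pena":      (0.285, 0.318, 0.338),
    "Carter":    (0.326, 0.307, 0.313),
    "Martinez":  (0.313, 0.336, 0.330),
    "Gonzalez":  (0.261, 0.304, 0.319),
    "Cedeno":    (0.329, 0.352, 0.317),
    "Corporan":  (0.379, 0.376, 0.349),
}


def over_under_pct(babip, x_babip):
    """Percent over (+) or under (-) performance relative to x-BABIP.

    ASSUMPTION: the gap is divided by actual BABIP. The article does not
    state the denominator; this choice is inferred from the table.
    """
    return (babip - x_babip) / babip * 100.0


for name, (actual, x1, x2) in TABLE.items():
    print(f"{name:<10} {over_under_pct(actual, x1):+6.1f}%  "
          f"{over_under_pct(actual, x2):+6.1f}%")
```

A negative result means the hitter's actual BABIP sits below the benchmark (under performance); a positive result means it sits above (over performance).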

Notice that the two x-BABIP formulae show the same direction of performance (i.e., under or over performance) in each case except for Ronny Cedeno. Cedeno under performed x-BABIP according to the Fangraphs formula, but over performed x-BABIP based on the Hardball Times formula. Excluding Cedeno, six of the remaining nine Astros hitters under performed x-BABIP. Corporan, Carter, and Barnes have over performed x-BABIP so far this year.
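The agreement check described above can be reproduced directly from the two published performance columns. This is a small sketch; the dictionary and function names are illustrative, and the numbers are the percentages exactly as printed in the table.

```python
# Published Over / (Under) percentages: (Performance(1), Performance(2)).
PERFORMANCE = {
    "Altuve":    (-4.7, -3.6),
    "Dominguez": (-11.7, -20.2),
    "Castro":    (-4.3, -0.8),
    "Barnes":    (4.5, 10.0),
    "Pena":      (-11.8, -18.7),
    "Carter":    (5.7, 3.9),
    "Martinez":  (-7.0, -5.3),
    "Gonzalez":  (-16.4, -22.3),
    "Cedeno":    (-6.9, 3.7),
    "Corporan":  (0.9, 8.0),
}


def direction(perf1, perf2):
    """Label a hitter by whether both formulae agree on the sign."""
    if perf1 > 0 and perf2 > 0:
        return "over"
    if perf1 < 0 and perf2 < 0:
        return "under"
    return "mixed"


groups = {"over": [], "under": [], "mixed": []}
for name, (perf1, perf2) in PERFORMANCE.items():
    groups[direction(perf1, perf2)].append(name)

for label, names in groups.items():
    print(f"{label:<6} ({len(names)}): {', '.join(names)}")
```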

Dominguez, Pena, and Gonzalez are the biggest under performers of x-BABIP. Gonzalez has suffered through a miserable offensive season, with an OPS below .600. So, it's not surprising that he has been a major under performer of x-BABIP.

Pena has been somewhat productive, but his OBP and power have been lower than expected. Both x-BABIP models estimate a much higher x-BABIP (.318 to .338) than his .285 actual BABIP. This may provide some hope that Pena's batting will improve in the future. However, Pena's vulnerability to defensive shifts may be one of the reasons for under performing x-BABIP. Defensive shifts typically reduce a player's BABIP by about 13 points (.013). In Pena's case, the x-BABIP differential (33 to 53 points) is roughly two and a half to four times the average effect of defensive shifts. Thus, bad luck remains a possible explanation for his BABIP under performance.
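To put the shift comparison in numbers, the sketch below divides Pena's gap under each formula by the .013 average shift effect cited above. The function name is illustrative, and the inputs are the rounded table values, so the multiples are approximate.

```python
SHIFT_EFFECT = 0.013  # average BABIP reduction from shifts, as cited in the article


def shift_multiple(babip, x_babip, shift_effect=SHIFT_EFFECT):
    """How many 'average shift effects' fit into the x-BABIP gap."""
    return (x_babip - babip) / shift_effect


pena_babip = 0.285
for label, x_babip in (("Fangraphs", 0.318), ("Hardball Times", 0.338)):
    gap = x_babip - pena_babip
    multiple = shift_multiple(pena_babip, x_babip)
    print(f"{label:<15} gap = {gap:.3f}, about {multiple:.1f}x the shift effect")
```

Even if the full 13 points were attributed to shifts, roughly 20 to 40 points of the gap would still be unexplained under these two benchmarks.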

Dominguez has been productive on occasion at the plate, but his overall offensive performance has been poor, in part due to a very low BABIP. Although other parts of Dominguez's offensive game need work, such as his walk rate, there is a lot of room for his BABIP to rise and thereby improve his batting average in the future. Perhaps an improvement in Dominguez's BABIP will allow him to push his OPS over the .700 threshold.

J.D. Martinez has experienced a moderate-to-high level of under performance. Martinez's current wRC+ (83) is disappointing, and any improvement in his BABIP would be welcome. A .325 BABIP supported Martinez's notable production (wRC+ 103) when he was first called up in 2011. His current x-BABIP is higher than his BABIP during that 2011 ML campaign; perhaps this is an encouraging sign for an upturn in his offense.

Barnes has the highest over performance under the Hardball Times formula and the largest potential for downward regression in BABIP. This isn't surprising; his BABIP has been regressing ever since he posted an unsustainable .483 BABIP in March/April. His monthly OPS has dipped below the .600 mark as his BABIP has regressed. Barnes' x-BABIP indicates that he can sustain a relatively high BABIP, but his current 2013 BABIP remains higher than even that high x-BABIP level.

Although Corporan has a high BABIP, his x-BABIP is also surprisingly high. The potential for regression based purely on BABIP is fairly moderate. x-BABIP is not a prediction of future performance, but instead focuses on whether the BABIP during a past time period is abnormal based on the batted ball distribution. Thus, x-BABIP is not a prediction of the player's true talent level. In the case of Corporan, his x-BABIP is quite high (over .346), but this is largely based on a high line drive rate (28%). Corporan's hitting skill may not sustain a line drive percentage that high, and if that's true, his performance may regress more than x-BABIP would indicate.

Carter has moderately over performed his x-BABIP. The extent of over performance is small enough that normally it might not be viewed as a concern. However, Carter's .231 batting average is already relatively low, and any additional decline in his BABIP might be a concern. A major reason for the apparent over performance is an above-average .794 BABIP on line drives. Typically, line drive BABIP will regress toward the low .700s. However, there is evidence that power hitters can sustain a somewhat higher BABIP on line drives, probably because their liners are hit harder. And Carter does hit the ball hard. The fact that Carter also had an .833 BABIP on line drives in 2012 may indicate that he can normally sustain above-average batting averages on liners. At this point, we can't reach any firm conclusions.

Any surprises here?