Sabermetrics: BABIP and Astros' Pitchers

Talking Astros Sabermetrics: BABIP and x-BABIP Tell Us A Lot About the Astros Rotation's Improvement


"The only sure thing about luck is that it will change."

Colloquially speaking, the Astros' starting pitching has gone from the outhouse to the penthouse over the course of this season. The rotation was disastrous early this year, but so far it has posted the third-best ERA in the majors for June. Let's turn to sabermetrics to ask why.

Sabermetrics keeps coming back to our old friend, BABIP. Batting Average on Balls in Play, or BABIP (defined here), can be used to evaluate whether a player's "results" over a particular period reflect over- or under-performance. This is true for both pitchers and hitters (but the evaluation is more complicated in the case of hitters). BABIP is a statistic that takes so long to stabilize that most seasonal and even multi-season BABIP results are "small sample sizes." Random variation is a huge factor affecting the distribution of hits and caught balls over discrete time periods. It follows that the variation in BABIP has a significant impact on results-oriented statistics like ERA.
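For readers who want the arithmetic, here is a minimal sketch of the commonly cited BABIP formula, (H - HR) / (AB - K - HR + SF). The linked definition isn't reproduced in this article, so treat the formula as my assumption about what it says; the function name and the stat line are made up for illustration and don't belong to any Astros pitcher.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play

# Hypothetical stat line, not taken from any real pitcher.
example = babip(hits=90, home_runs=10, at_bats=340, strikeouts=70, sac_flies=4)
print(f"{example:.3f}")
```

Home runs and strikeouts drop out because neither gives a fielder a chance to make a play, which is why BABIP isolates what happens on balls the defense can reach.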

Because pitchers have relatively limited control over BABIP, we expect a pitcher's high or low BABIP to revert to the mean over some extended period of time. What is the benchmark as to whether a pitcher's BABIP is high or low? Sometimes analysts compare a pitcher's BABIP to the league average BABIP. But that comparison may lack some precision. The expected mean BABIP is dependent to some extent on the types of batted balls allowed by the pitcher. For example, fly balls are less likely to fall in for hits than groundballs or line drives.
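To get a rough feel for why reversion takes "some extended period of time," here is a back-of-the-envelope sketch. It assumes every ball in play is an independent trial with the same hit probability, which is a simplification and not something this article establishes. The .297 figure is the AL starters' average cited in this article; the sample sizes and the function name are hypothetical.

```python
import math

def babip_standard_error(true_babip, balls_in_play):
    """Standard deviation of observed BABIP, assuming each ball in play is an
    independent trial with the same hit probability (a simplifying assumption)."""
    return math.sqrt(true_babip * (1 - true_babip) / balls_in_play)

LEAGUE_BABIP = 0.297  # AL starters' average cited in this article

# Hypothetical sample sizes, chosen only to show how the noise shrinks.
errors = {bip: babip_standard_error(LEAGUE_BABIP, bip) for bip in (100, 500, 2000)}
for bip, se in errors.items():
    print(f"{bip:>5} balls in play: about +/- {se * 1000:.0f} points (one standard deviation)")
```

Under that assumption the noise shrinks only with the square root of the number of balls in play, so it takes four times as many balls in play to cut the random swing in half.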

Last year, I discussed a formula for Expected BABIP, or x-BABIP, for pitchers. This formula gives us a benchmark for evaluating whether a pitcher's ERA reflects under- or over-performance in terms of converting batted balls into outs. The formula was derived by Steve Staude at Fangraphs. The primary inputs to the formula are line drive rate, fly ball rate, and infield fly ball rate. (You can refer back to my previous article or Staude's earlier piece for details on the formula.) When I calculate the formula based upon AL average batted ball data so far this season, the formula produces exactly the average BABIP (.297) for American League starting pitchers. I take that as a good sign.

The Astros' starting pitching currently carries the second-highest BABIP in the majors. And the Astros starters' BABIP is significantly higher than the rotation's x-BABIP, as shown below. In order to evaluate whether this is caused by the early season failures of Humber and Peacock, both of whom were removed from the rotation in May, I have also shown the Astros' BABIP and x-BABIP excluding those two pitchers.

Astros Rotation: 2013 BABIP vs. x-BABIP

                               BABIP   x-BABIP   Difference
Including Humber and Peacock   .314    .297      .0168
Excluding Humber and Peacock   .310    .296      .0144

Clearly, Humber and Peacock contributed to the high BABIP, but they weren't the main cause. The difference between the actual BABIP and the expected BABIP is very similar, whether Humber's and Peacock's results are included or not. Part of the reason for the apparent under-performance by the rotation lies with the defense. Given that the Astros' defense is ranked 20th or 29th, based on DRS and UZR, respectively, we can surmise that the Astros' fielders didn't give a lot of help to the rotation. However, the size of the difference between the actual and expected results seems too large to be caused solely by defense. This would suggest that randomness or "luck" is a contributor to the difference, which means that the rotation's 2013 results may improve over the course of the season.

In fact, the rotation's BABIP has been regressing each month this season. The monthly BABIP and x-BABIP comparison is shown below.

Period      x-BABIP   BABIP   Difference
Mar./Apr.   0.300     0.345    0.045
May         0.303     0.322    0.019
June        0.284     0.265   -0.019

And, by the way, the extremely high BABIP in March/April would not decrease if the performances of Peacock and Humber were excluded. After the horrible BABIP results in March and April, the BABIP declined in both May and June. The under-performance in March through May has been partially offset so far by an apparent over-performance in June. The hit suppression in June may not be completely sustainable, but we wouldn't expect a return to the high BABIP totals experienced in March through May.
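The month-by-month swing is easier to see in "points" (thousandths of BABIP). The sketch below just re-does the subtraction from the monthly table; the helper name is my own, and a positive number means more balls fell in for hits than the batted ball mix would predict.

```python
# (x-BABIP, BABIP) by period, copied from the monthly table above.
monthly = {
    "Mar./Apr.": (0.300, 0.345),
    "May": (0.303, 0.322),
    "June": (0.284, 0.265),
}

def points_over_expected(x_babip, babip):
    """Actual minus expected BABIP, in points; positive means more hits than expected."""
    return round((babip - x_babip) * 1000)

gaps = {period: points_over_expected(x, b) for period, (x, b) in monthly.items()}
for period, gap in gaps.items():
    print(f"{period:<10} {gap:+d} points")
```

The gap shrinks from +45 to +19 and then flips to -19, which is the partial offset described above.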

The monthly regression is not confined to the actual BABIP: the expected BABIP also dropped in June, to .284, after sitting at .300 and .303 in the first two periods. This reflects the changing mix of batted ball types, principally line drives. The starters' line drive rate has decreased from 22% to 20% to 18% over the three time periods. Line drive rates normally are subject to significant fluctuation and regression. However, some of the decline in line drives could reflect sharper control, pitch sequencing, and other pitching refinements as the year has proceeded.

The BABIP and expected BABIP for the current starting pitchers are shown below.

Pitcher   x-BABIP   BABIP   Difference
Keuchel   0.297     0.304    0.007
Harrell   0.301     0.300   -0.001
Lyles     0.304     0.297   -0.007
Norris    0.299     0.324    0.025
Bedard    0.265     0.317    0.052

Three of the five pitchers--Keuchel, Norris, and Bedard--have experienced higher than expected BABIP. Lyles and Harrell allowed BABIPs slightly lower than expected. Norris and Bedard have allowed BABIPs that are much higher than expected (25 and 52 points above expectations, respectively).
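The same comparison can be used to rank the starters and flag the ones furthest above expectation. This is only a sketch: the 20-point cutoff is a hypothetical threshold I picked for illustration, not an established standard for "much higher than expected."

```python
# (x-BABIP, BABIP) for the current starters, copied from the table above.
starters = {
    "Keuchel": (0.297, 0.304),
    "Harrell": (0.301, 0.300),
    "Lyles": (0.304, 0.297),
    "Norris": (0.299, 0.324),
    "Bedard": (0.265, 0.317),
}

THRESHOLD_POINTS = 20  # hypothetical cutoff, chosen only for illustration

# Actual minus expected BABIP in points (thousandths).
gaps = {name: round((babip - x) * 1000) for name, (x, babip) in starters.items()}

# Largest gap first: the pitchers with the most room for favorable reversion.
ranked = sorted(gaps.items(), key=lambda item: item[1], reverse=True)
flagged = [name for name, gap in ranked if gap >= THRESHOLD_POINTS]

for name, gap in ranked:
    print(f"{name:<8} {gap:+d} points")
print("Flagged:", ", ".join(flagged))
```

With these inputs only Bedard and Norris clear the cutoff; the other three sit within 7 points of their expected BABIP.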

Norris and Bedard are distinguishable from the other pitchers because they are fly ball pitchers who use the four-seam fastball as a bread-and-butter pitch. They may be hurt more by the defense than Harrell, Lyles, and Keuchel. According to DRS, the Astros' infield is +5 runs saved, but the outfield is -12 runs. This may be a partial explanation for the high BABIPs rung up against Norris and Bedard. On the other side, the BABIPs of Lyles and Harrell, which sit below their x-BABIPs, may reflect the benefit they receive from the infield defense and shifts.

The x-BABIP evaluation suggests a potential beneficial reversion for Norris' and Bedard's hit rates in the future. Harrell and Lyles could experience a higher BABIP in the future, but I don't see a worrisome red flag here because their x-BABIP differential is relatively small. Similarly, Keuchel's current BABIP appears to be sustainable and may even decline a bit in the future.

Ballpark configurations and characteristics can affect BABIP. The final comparison, therefore, will examine home and road splits in Astros pitcher BABIP. As shown below, the Astros' pitcher BABIP at Minute Maid Park is much higher than on the road.

Split   x-BABIP   BABIP   Difference
Home    0.299     0.320   0.021
Away    0.292     0.298   0.006

The pitchers' actual BABIP is higher than expected both at home and on the road, but the home BABIP exceeds its expected value by a much wider margin. Perhaps the vast expanse of MMP's center field allows more fly balls to fall in for hits. Or maybe it's just a small sample size phenomenon. The BABIP split has led to a much higher Astros ERA at home than on the road (5.04 vs. 4.28). Bud Norris, the king of pitching at home, exhibits similar BABIP rates at home and on the road. But it's interesting--maybe even amazing--that Norris can maintain a 2.67 ERA at home while simultaneously allowing a .323 BABIP.

BABIP isn't the only characteristic that affects pitching analysis. DIPS posits that K rate and BB rate--and HR rates to a lesser extent--are the most important pitching skill factors. But this analysis focuses on the pitching characteristic that is least correlated with pitching skill and most likely subject to statistical regression. BABIP regression isn't the only reason for the turnaround in the rotation's results, but it is a major contributing factor.

Next week I will tackle x-BABIP and the sustainability of Astros' hitters' BABIP.