How much is the bullpen improved?

Sabermetrics: Quantifying Potential Improvement in the Astros Bullpen

Kyle Terada-USA TODAY Sports

The Astros came into the off-season with bullpen improvement as a priority. Depending on the metric, the Astros' bullpen last season was either the worst or among the worst in the majors. The Astros came away from the winter meetings with signed contracts for late-inning relief pitchers Luke Gregerson and Pat Neshek. They will join Chad Qualls as the late-inning troika out of the bullpen. Holdovers Josh Fields and Tony Sipp also show promise. The Astros have a number of candidates for the remaining slots in the bullpen.

In light of these signings, I will attempt to quantify the potential improvement over the 2014 bullpen. Advanced metrics like WAR, SIERA, and FIP are not perfect tools for evaluating relief pitchers, but they are better than traditional stats like ERA. Relief pitchers face a variety of situations, with varying leverage and numbers of batters faced, and each advanced metric captures some facets of relief pitching better than others.

The primary benefit from signing Gregerson and Neshek is that they reduce the high-leverage innings pitched by lesser relief pitchers in the bullpen. In effect, the rest of the relief chain is bumped down to lower-leverage situations than it faced in 2014. WAR calculations attempt to account for this effect with a leverage adjustment. In addition, the worst of the relief pitchers who pitched in 2014 are off the roster and will no longer soak up any innings.

How I Did It

I assumed that Gregerson and Neshek will replace the following relief pitchers, who together pitched 130 innings in the bullpen: Zeid, Bass, Williams, Farnsworth, and Clemens. That's 28% of the total bullpen innings last season. It's an easy selection because we already know those pitchers will not pitch in the bullpen again, and they produced some of the worst results for Houston. There is no need for guesswork. Strictly speaking, Gregerson and Neshek didn't cause these relievers to lose their jobs; they were already removed from the roster before the free agents were signed. But we know that (assuming good health) Gregerson and Neshek should pitch about 130 innings between them. And their presence means that two fewer bullpen slots need to be filled. In addition, since we are measuring the improvement over last year's bullpen, it is appropriate to remove these pitchers' performance.

I also assume that Gregerson and Neshek pitch 65 innings each, which is conveniently close to their workload last year. For purposes of projecting Gregerson's performance, I considered calculating averages for the last three years, but when I examined that period, his performance last season was very close to his average for the period. Again, that's convenient because it allows me to base his projection on last season. Neshek is more difficult to project, particularly given that he experienced some changes in both his pitch selection and usage last year. I also believe it is appropriate to assume some regression from his excellent 2014 stats. His BABIP was very low (.233); even though late-inning relievers can sustain a below-average BABIP, the likely target is a bit higher than what he produced. Therefore, I reduced his 2014 performance stats by 30% in a rough attempt at regressing his results.
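As a rough illustration of that 30% haircut, the sketch below applies it to a WAR figure. The function name is hypothetical, and the unregressed inputs (1.8 and 2.4) are assumptions: they are simply what the regressed Neshek figures used later in this article (1.26 and 1.68) imply when divided by 0.7, not numbers quoted here.

```python
def regress(stat: float, haircut: float = 0.30) -> float:
    """Shrink a performance stat toward zero by a flat percentage."""
    return stat * (1.0 - haircut)

# Assumed unregressed 2014 WAR values for Neshek, back-solved from the
# regressed figures in this article (1.26 and 1.68); not quoted directly.
# The fWAR/bWAR labels are inferred from the article's later arithmetic.
neshek_2014 = {"fWAR": 1.8, "bWAR": 2.4}

regressed = {name: round(regress(value), 2) for name, value in neshek_2014.items()}
print(regressed)
```

A flat percentage is the crudest possible form of regression; it is used here only because that is what the paragraph above describes.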

fWAR and bWAR

Wins Above Replacement (WAR) is an appealing framework for quantifying the improvement in the Astros' bullpen. WAR translates performance into wins, and wins are the ultimate measure of improvement. Fangraphs WAR (fWAR) and Baseball-Reference WAR (bWAR) are alternative methods for calculating WAR. Both methods aim to do the same thing, but, for pitching, each attacks the problem from a different angle. Both approaches attempt to isolate the pitcher's performance from the defense. But bWAR uses advanced fielding metrics to adjust the actual runs allowed, while fWAR is based only on results controlled by the pitcher (BB, K, HR). According to Fangraphs, the Astros had the second worst bullpen at 0.5 fWAR. bWAR pegs the 2014 bullpen at -2.6.

Let's call Zeid, Bass, Williams, Farnsworth and Clemens the "replaced players." We can compare their cumulative WAR with total WAR for Gregerson and Neshek.  (A reminder that Neshek's WAR has been reduced by 30%.)

fWAR
Gregerson            0.9
Neshek               1.26
total                2.16
Replaced Pitchers   -2.4

bWAR
Gregerson            1.7
Neshek               1.68
total                3.38
Replaced Pitchers   -3.1

The next step is to deduct the replaced pitchers' WAR from the 2014 bullpen WAR and add the Gregerson/Neshek WAR, producing the "new" bullpen WAR. The difference between the actual and new bullpen WAR quantifies the improvement in wins.

                             actual   new    difference
bWAR 2014 Astros bullpen     -2.6     3.88   6.48
fWAR 2014 Astros bullpen      0.5     5.06   4.56

Using bWAR, the improvement is 6.48 wins, and based on fWAR, the improvement is 4.56 wins.
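The arithmetic behind those two results can be sketched in a few lines. The function and variable names are hypothetical, and the pairing of each unlabeled table with fWAR or bWAR is an inference: it is the only pairing that reproduces the 3.88 and 5.06 totals.

```python
def new_bullpen_war(actual: float, replaced: float, added: float) -> float:
    """Remove the replaced pitchers' WAR, then add the newcomers' WAR."""
    return actual - replaced + added

# Figures from the tables above. Which table is fWAR and which is bWAR is
# inferred from the arithmetic, since only one pairing reproduces the totals.
inputs = {
    "bWAR": {"actual": -2.6, "replaced": -3.1, "added": 1.7 + 1.68},
    "fWAR": {"actual": 0.5, "replaced": -2.4, "added": 0.9 + 1.26},
}

for metric, v in inputs.items():
    new = new_bullpen_war(v["actual"], v["replaced"], v["added"])
    print(f"{metric}: new = {new:.2f}, improvement = {new - v['actual']:.2f}")
```

Because the replaced pitchers were below replacement level, subtracting their negative WAR adds wins before Gregerson and Neshek contribute anything.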

RE24 Wins

RE24 is a run expectancy statistic based on the 24 base-out states. An example of a base-out state is "bases loaded, one out." There are 24 such combinations: eight arrangements of runners on base times three out counts (zero, one, or two). The run expectancy for each base-out state is the average number of runs that score from that situation through the end of the inning. RE24 measures whether the pitcher produced a change in run expectancy that is better or worse than the average for the particular base-out situation. REW converts RE24 into wins. The proponents of RE24 contend that it is particularly useful for representing relief pitcher performance. A relief pitcher may enter the game facing any one of the 24 base-out situations, and RE24 measures his effectiveness based on the situation.
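To make the idea concrete, here is a minimal sketch of how a reliever earns RE24 credit. The run expectancy values are made-up placeholders (real tables are built from league-wide play-by-play data), and all names are hypothetical.

```python
from itertools import product

BASES = ["empty", "1st", "2nd", "3rd", "1st+2nd", "1st+3rd", "2nd+3rd", "loaded"]
OUTS = [0, 1, 2]
STATES = list(product(BASES, OUTS))  # 8 base arrangements x 3 out counts

# Placeholder run expectancies (expected runs through the end of the inning).
# These two values are illustrative only, not taken from any real table.
RUN_EXP = {("loaded", 1): 1.5, ("loaded", 2): 0.75}
END_OF_INNING = 0.0

def runs_saved(start_re: float, end_re: float, runs_scored: int) -> float:
    """Pitcher's credit for one event: the drop in expectancy minus runs allowed."""
    return start_re - end_re - runs_scored

# A reliever enters with the bases loaded and one out, strikes out the first
# batter, then gets a groundout to end the inning with no runs scoring.
strikeout = runs_saved(RUN_EXP[("loaded", 1)], RUN_EXP[("loaded", 2)], 0)
groundout = runs_saved(RUN_EXP[("loaded", 2)], END_OF_INNING, 0)
print(len(STATES), strikeout + groundout)
```

In this sketch the reliever is credited with the full run expectancy he inherited, which is the kind of situational value that ERA never records for him.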

The Astros had the worst bullpen REW in the majors at -6.56 wins above average. If you want a statistic that shows why you disliked the 2014 Astros' bullpen, look to RE24 and REW. Only three of the 20 Astros' relief pitchers in 2014 had a positive RE24. Neshek and Gregerson would have been the two highest REW pitchers in the Astros' bullpen. Unlike bWAR and fWAR, REW is not independent of the team defense. In addition, because RE24 is affected by random volatility due to BABIP and sequencing of hits, RE24 has a relatively low correlation with the succeeding year's RE24. However, weak correlation does not mean that correlation is non-existent, and some relievers maintain a consistently positive RE24 from year to year.

In order to make the REW results comparable to fWAR and bWAR, runs above average must be converted to runs above replacement.  I used the difference (based on Baseball-Reference) between wins above average and wins above replacement per inning for the Astros' pitching staff to convert REW to a WAR equivalent.  The table below deducts the converted REW for replaced pitchers and adds converted WAR for Gregerson and Neshek in order to arrive at a new WAR for the bullpen based on REW.


Gregerson                    0.89
Neshek                       1.73
subtotal                     2.62
convert to WAR               3.78

Replaced Pitchers           -4.88
convert to WAR              -3.72

                                      New    difference
2014 Astros bullpen REW     -6.53
convert to WAR              -3.03     5.63   8.66
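The conversion from wins above average to wins above replacement amounts to adding a per-inning credit. The sketch below back-solves that credit from the Gregerson/Neshek rows of the table, which is an assumption made for illustration; the article derives it from Baseball-Reference figures for the Astros' staff, which are not reproduced here. The names are hypothetical, and the sketch covers only the two 130-inning groups.

```python
INNINGS = 130  # innings assumed in this article for each group of pitchers

# Offset implied by the table: converted WAR minus REW for Gregerson + Neshek.
offset = 3.78 - 2.62
per_inning = offset / INNINGS  # implied replacement-level credit per inning

def rew_to_war(rew: float, innings: float) -> float:
    """Shift wins above average to wins above replacement."""
    return rew + per_inning * innings

# Applying the same per-inning credit to the replaced pitchers' REW.
print(round(rew_to_war(-4.88, INNINGS), 2))
```

The same offset moves both groups by the same amount because both are assigned the same 130 innings.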

Based on RE24, the improvement in the bullpen is significantly higher than what we found with bWAR and fWAR. An improvement of almost 9 wins would be quite impressive. If the Astros had achieved the REW result projected above, the team would have had the 10th highest bullpen REW. Note that the numbers in the table above have to be converted back to wins above average for comparison to the Fangraphs REW team leaderboard.


SIERA

SIERA (Skill-Interactive ERA) is a rate stat that is considered more predictive in nature. It has a relatively high correlation with succeeding-year ERA. The formula takes into account some batted ball characteristics as well as strikeouts and walks.

The Astros had the 7th worst bullpen (3.58), according to SIERA. The SIERAs for Gregerson and Neshek are 2.95 and 2.55, respectively. In the table below, the replaced players' contribution to the total bullpen SIERA is removed and replaced with the Gregerson and Neshek SIERA.

2014 Astros Bullpen

SIERA       3.58
New SIERA   3.16

After the change, the new bullpen SIERA would rank 8th best in the majors. Moreover, given that Gregerson's and Neshek's SIERAs are lower than their respective FIPs (FIP is used to produce fWAR), the SIERA improvement suggests that the fWAR wins are a conservative estimate of bullpen improvement.
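Swapping one group's contribution out of a team rate stat can be sketched as a weighted average. Several inputs here are assumptions rather than figures from this article: weighting by innings, the 464 total bullpen innings (130 divided by the 28% share mentioned earlier, rounded), and the 4.25 SIERA for the replaced pitchers, which is a placeholder chosen so the sketch lands near 3.16. The function name is hypothetical.

```python
def swap_rate(team_rate, team_ip, out_rate, out_ip, incoming):
    """Innings-weighted rate after removing one group and adding others.

    incoming is a list of (rate, innings) pairs for the new pitchers.
    """
    total = team_rate * team_ip - out_rate * out_ip
    total += sum(rate * ip for rate, ip in incoming)
    new_ip = team_ip - out_ip + sum(ip for _, ip in incoming)
    return total / new_ip

TEAM_IP = 464          # assumed: 130 innings / 0.28, rounded
REPLACED_SIERA = 4.25  # placeholder, not a figure from the article

new_siera = swap_rate(3.58, TEAM_IP, REPLACED_SIERA, 130, [(2.95, 65), (2.55, 65)])
print(round(new_siera, 2))
```

The worse the replaced group's rate and the larger its share of innings, the bigger the drop in the team figure.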


The calculations point to a bullpen WAR improvement of 4.5 to 8.6 wins. Is that pretty good? Heck, yes.

For those who want to see more numbers, data for the SIERA calculation are shown below.





Replaced Players

2014 Astros bullpen SIERA   3.58
replaced                    3.16
difference                  -0.4172