Astros Sabermetrics Grab Bag

Quick sabermetric notes on prospect projections, "the trade," and RE24 for Altuve and Dominguez

Looking for a nice long sabermetric study today?  Sorry, not today.  But maybe three notes on unrelated topics which involve both sabermetrics and the Astros will provide food for thought.  Sometimes the casual sabermetric observations are the most interesting.  Without infringing on David's "Three Astros' Things" label, here are three sabermetric things.


Subber10 wrote about the minor league projection system called "Gapper."  Let's take a look at Chris St. John's recent version of his minor league projection model called "Javier," which is set out in three articles at Beyond the Boxscore. An accurate minor league projection system is the holy grail of minor league sabermetrics.  We're a long way from systems which are foolproof.  But each of the efforts at producing a projection system assists analysts in understanding the factors which are important in calculating the odds of minor league players' success in the majors.

A key difference between Gapper and Javier is that Gapper produces accumulated points which can be used to rank players, while Javier attempts to project the probability that a prospect's path to the majors ends in one of three outcomes: "productive ML player," "average ML player," or "bust."  Also, Javier only produces results for position players, not pitchers.

Previously St. John has written about the influence of walk rates and strikeout rates on prospect success.  However, in his own words: "But a trend became apparent through the analysis: slugging prospects were severely underrated, including the system's namesake Javier Baez."   As a result, the new version of the system includes isolated power (ISO) as a significant input, along with walk rate and strikeout rate.  In addition, age and level for the minor league statistics are incorporated into the projection.  For the statistically minded, the system develops z-scores for BB%, K%, and ISO.
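For readers who want to see what a z-score does with these inputs, here is a minimal sketch.  It is not St. John's model: the peer group, the player labels, and the numbers are all hypothetical, and the articles' actual handling of age and level is not reproduced here.  It only shows the standardization step, which expresses each hitter's BB%, K%, and ISO as a number of standard deviations above or below a comparison group.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize values: (value - group mean) / group standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the peer group
    return [(v - mu) / sigma for v in values]

# Hypothetical peer group: hitters of the same age at the same level.
names = ["Hitter A", "Hitter B", "Hitter C", "Hitter D"]
bb_pct = [6.0, 9.0, 12.0, 9.0]
k_pct = [25.0, 20.0, 15.0, 20.0]
iso = [0.100, 0.150, 0.250, 0.100]

bb_z = z_scores(bb_pct)
k_z = z_scores(k_pct)   # for K%, a negative z-score is the good direction
iso_z = z_scores(iso)

for name, b, k, i in zip(names, bb_z, k_z, iso_z):
    print(f"{name}: BB% z={b:+.2f}  K% z={k:+.2f}  ISO z={i:+.2f}")
```

In this made-up group, Hitter C walks more, strikes out less, and hits for more power than his peers, so his BB% and ISO z-scores come out positive and his K% z-score negative.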

The introductory article includes the minor league data through June 2014 for St. John's top prospect list.  So, let's go to the 2014 results for Astros' prospects:

Player               Productive%  Average%  Bust%    Javier Guess
Jonathan Singleton   43.30%       30.00%    26.70%   Productive
Carlos Correa        37.70%       11.50%    50.80%   Productive
Ronald Torreyes      34.80%       21.70%    43.50%   Productive
Domingo Santana      22.30%       23.40%    54.30%   Productive
George Springer      21.00%       15.30%    63.70%   Productive
Preston Tucker       19.00%       19.00%    62.10%   Productive
Brett Phillips       18.60%       18.60%    62.70%   Productive
Rio Ruiz             16.80%       14.70%    68.50%   Productive
Teoscar Hernandez    12.20%       8.30%     79.50%   Productive
Andrew Aplin         12.20%       12.80%    75.00%   Productive
M.P. Cokinos         11.40%       20.20%    68.40%   Bust
Japhet Amador        11.40%       27.70%    61.00%   Bust
Delino DeShields     11.20%       27.40%    61.40%   Average
Carlos Perez         8.90%        10.30%    80.70%   Average
Max Stassi           6.30%        29.90%    63.70%   Average
Danry Vasquez        5.50%        15.70%    78.70%   Average
Jonathan Meyer       4.80%        13.70%    81.50%   Average
Tyler Heineman       2.80%        9.20%     88.10%   Average
Jonathan Meyer       1.00%        3.80%     95.10%   Average

For comparison, here are Gapper's results for Astros' prospects.

Interestingly, Jonathan Singleton has the second best profile among all 2014 minor leaguers, just below the Red Sox's Mookie Betts and just above the Dodgers' Joc Pederson.  Singleton's high probability of becoming a productive major leaguer has to be a good sign for Astros' fans, right?  Despite a rocky start in the majors, hopefully Singleton fulfills the promise of the Javier projection.  It's also interesting that Torreyes grades so well in both the Javier and Gapper systems.  Keep in mind that both Torreyes and Singleton were young for AAA.   Any surprises for you?

"The Trade"

Given the timing of this article, the Astros' trade with the Marlins is a topic occupying a lot of space at TCB.  Let me just focus on one point: Bo Porter's quote (as relayed in J.J. Ortiz's tweet).  "Once those guys went down internally that was something we discussed internally," Porter said of the need to get a CF.   Initially the Astros were believed to be searching for a major league bat, but I suspect those were simply unavailable on the trade market.  The next question would be, "What else can we do to improve the ML team?"  That leads to defense.

Next, let's examine what the advanced metrics might add to the question.  Defensive Runs Saved (DRS) is broken down on a team basis at Bill James Online.  The Astros are -21, on a team basis, for defense.  Of this below average fielding, CF is -16, or 76% of the team's fielding deficiency.  If you believe (like I do) that advanced metrics underrate the Astros' infield defense because of the Astros' infield shifts, the CF fielding may account for an even larger share of the problem.  Coincidentally, the Astros lead the majors in team DRS for shift plays (+16), which is exactly erased by the CF fielding deficiency, at least if you accept DRS' numbers.   So, if you are asked, "Where can we improve team defense?" DRS obviously points to CF.
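The arithmetic behind those two observations is simple enough to check by hand, but here it is as a short sketch.  The three DRS figures are the ones quoted above; the variable names are mine.

```python
team_drs = -21    # Astros' team DRS, as quoted above
cf_drs = -16      # center field DRS, as quoted above
shift_drs = 16    # team DRS on shift plays, as quoted above

# Share of the team's total fielding deficit that comes from center field.
cf_share = cf_drs / team_drs

# Net of the shift gains and the center field losses.
shift_plus_cf = shift_drs + cf_drs

print(f"CF share of team deficit: {cf_share:.0%}")
print(f"Shift plays plus CF: {shift_plus_cf:+d} runs")
```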

GIDPs, RE24, and Altuve and Dominguez

I have written previously about the stat called RE24.  In short, RE24 measures the extent to which a player changes the run expectancy of the 24 base-out situations he faces as a hitter.  It is a cumulative stat covering all of the player's plate appearances.  Zero means the player produced an average result across the base-out situations he faced; a positive number means better than average, and a negative number worse than average.  Unlike other advanced offensive metrics, RE24 credits a hitter not only for the normal offensive events like walks and hits, but also for moving a runner over with less than two out or hitting a sacrifice fly, and it debits him for grounding into double plays.  Thus, RE24 is more dependent on the context of each action by the hitter.
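Written as a formula, the description above comes out roughly as follows.  The symbols are my own shorthand rather than anything official: N is the hitter's number of plate appearances, RE(s) is the league-average run expectancy of base-out state s, and R_i is the number of runs that score on plate appearance i.

```latex
\mathrm{RE24} \;=\; \sum_{i=1}^{N} \Bigl( \mathrm{RE}\bigl(s_i^{\text{end}}\bigr) \;-\; \mathrm{RE}\bigl(s_i^{\text{start}}\bigr) \;+\; R_i \Bigr)
```

When a plate appearance ends the inning, the ending run expectancy is zero, which is one reason inning-ending double plays are so costly.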

FanGraphs recently updated its library's explanation of RE24, if you want to check it out.

Let's look at two Astros who are underperforming on RE24, Jose Altuve and Matt Dominguez.  For hitters, I compare RE24 to the player's wRAA (weighted runs above average).  wRAA uses linear weights to calculate the average run value of hitting events (1B, HR, BB, etc.) but without consideration of base-out states.

Difference: RE24 and wRAA

(Negative = wRAA is higher)

Altuve  -9.256

Dominguez -7.5

What does this mean?  Because of his performance results in various base-out states, Altuve is 9.3 runs (approx. 1 win) worse than the linear weights metrics (which include wOBA and wRC) tell us.  For Dominguez, it's 7.5 runs worse, or close to 1 win worse.  This means the average hitter would have contributed that many more runs than those two players, if confronted with the same opportunities.
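The "approx. 1 win" shorthand rests on a runs-to-wins conversion.  A rough sketch, assuming the common rule of thumb of about 10 runs per win (that constant is my assumption, not a figure from the sources above; the two differentials are the ones listed in the table):

```python
RUNS_PER_WIN = 10.0  # assumed rule of thumb; the exact value varies by season

# RE24 minus wRAA, as listed above (negative = wRAA is higher).
re24_minus_wraa = {"Altuve": -9.256, "Dominguez": -7.5}

# Convert each run differential into an approximate win differential.
wins = {name: diff / RUNS_PER_WIN for name, diff in re24_minus_wraa.items()}

for name, w in wins.items():
    print(f"{name}: {re24_minus_wraa[name]:+.1f} runs is about {w:+.2f} wins")
```

Under that assumption, both players land a little short of a full win, with Altuve's gap the larger of the two.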

Why?  Let's begin with the fact that Dominguez and Altuve lead the Astros in GIDP, with 16 and 12 respectively.  It's true that other factors, like their clutch hitting and base running, as well as plain old luck, affect RE24.  But GIDPs are highly damaging to run expectancy, and the double plays they hit into likely are major contributors to the negative difference between RE24 and wRAA.  Sure, there are high-GIDP sluggers who hit a bunch of HRs with runners on base and still post an RE24 higher than their wRAA.  But Altuve hits few HRs, and Dominguez has low overall power (ISO of .133), despite hitting 13 HRs.  The fact that neither player draws many walks with runners on base doesn't help.

Sometimes people assume that a propensity to hit into double plays is not that big a deal, because GIDPs seem like a relatively small number of plays.  But the run expectancy damage can be quite high, depending in part on the number of runners on base and whether it is 0 or 1 out.
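To put rough numbers on that, here is a minimal sketch of the run expectancy change for a double play in four different situations.  The run expectancy values are illustrative, rounded figures that I am assuming for the example; they are on the general scale of published run expectancy tables but are not taken from any particular season, and the function name is hypothetical.

```python
# Illustrative run-expectancy values (assumed, rounded); keys are (bases, outs).
RUN_EXPECTANCY = {
    ("empty", 2): 0.12,
    ("1st", 0): 0.95,
    ("1st", 1): 0.57,
    ("3rd", 2): 0.39,
    ("loaded", 0): 2.42,
    ("loaded", 1): 1.65,
}

def re_change(start, end, runs_scored=0):
    """RE24 credit for one play: RE(end) - RE(start) + runs scored.

    end=None means the play ended the inning, so remaining expectancy is 0.
    """
    end_value = 0.0 if end is None else RUN_EXPECTANCY[end]
    return end_value - RUN_EXPECTANCY[start] + runs_scored

gidp_costs = {
    "runner on 1st, 0 out": re_change(("1st", 0), ("empty", 2)),
    "runner on 1st, 1 out": re_change(("1st", 1), None),
    "bases loaded, 0 out (run scores)": re_change(("loaded", 0), ("3rd", 2), runs_scored=1),
    "bases loaded, 1 out": re_change(("loaded", 1), None),
}

for situation, cost in gidp_costs.items():
    print(f"{situation}: {cost:+.2f} runs")
```

With these assumed values, a single double play costs anywhere from a bit more than half a run to well over a run and a half, so a dozen or more of them over a season can add up to a large part of a gap like the ones shown for Altuve and Dominguez.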

Just so I don't end on a negative note about Altuve, who is still having a very good season, let me link you to an article (non-paywall) at Bill James Online which shows Altuve as the 4th best baserunner in baseball (tied with Hunter Pence).  Since Altuve is a good base stealer, which is picked up in RE24, it is all the more surprising that Altuve's RE24 isn't at least equal to his wRAA.  Perhaps it's just random luck on the timing of some past batting situations.

Among Astros' starting position players, Jason Castro, Dexter Fowler, George Springer, and Robbie Grossman have RE24 results better than their wRAA.

(Note: The RE24 stats are as of Aug. 1; Altuve's and Dominguez's differential compared to wRAA improved slightly after games over the weekend.)