Regression. It's a word used often on this site. Regression usually carries a negative connotation, à la Jimmy Paredes' 2013 season and Chris Johnson's 2011 season. It is used heavily in the world of predictive stats. The predictive stats this article will analyze today are BABIP and its derivative, xBABIP, along with why regression for the 2014 Astros is going to be a good thing.
What is BABIP?
For those who have no idea what BABIP is, here is the word-for-word definition from the FanGraphs Glossary (I highly recommend reading the full entry if you have never heard of the stat):
Batting Average on Balls In Play (BABIP) measures how many of a batter's balls in play go for hits.
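To make that definition concrete, here is a minimal sketch of the standard BABIP formula, (H - HR) / (AB - K - HR + SF). The function name and the stat line are my own illustration and do not belong to any real player.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    # Home runs and strikeouts never reach the fielders, so they are removed.
    # Sacrifice flies are balls in play that do not count as at-bats.
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play to measure")
    return (hits - home_runs) / balls_in_play


# Hypothetical stat line, made up for illustration only.
example = babip(hits=30, home_runs=5, at_bats=120, strikeouts=25, sac_flies=2)
print(f"{example:.3f}")  # prints 0.272
```

In this made-up line, 25 of 92 balls in play fell for hits.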
For a little more visual explanation, check out the video below.
How do we use xBABIP?
It's actually quite simple. xBABIP (expected BABIP) estimates what a BABIP should be based on batted ball data, so we can look at the discrepancies between xBABIP and BABIP. If xBABIP is higher than the actual BABIP of a player or team, we can expect positive regression (the BABIP should rise). If xBABIP is lower than the player or team's actual BABIP, we can expect negative regression (the BABIP should fall).
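The comparison above can be sketched in a few lines. This is only an illustration: the function name, the 10-point tolerance for calling a gap meaningful, and the sample numbers are all assumptions of mine, not values from the data behind this article.

```python
def regression_outlook(babip, xbabip, tolerance=0.010):
    """Classify the expected direction of BABIP regression.

    tolerance is an assumed cutoff (10 points of BABIP) below which
    the gap is treated as noise; it is not an established threshold.
    """
    gap = xbabip - babip
    if gap > tolerance:
        return "positive"   # xBABIP is higher, so BABIP should rise
    if gap < -tolerance:
        return "negative"   # xBABIP is lower, so BABIP should fall
    return "neutral"


# Hypothetical hitters, made up for illustration only.
print(regression_outlook(babip=0.200, xbabip=0.300))  # prints positive
print(regression_outlook(babip=0.400, xbabip=0.320))  # prints negative
```

The tolerance keeps tiny gaps from being labeled as regression candidates. Pick whatever cutoff fits the sample size you are working with.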
The Astros vs the Rest of the MLB
The Astros are in a league of their own when comparing BABIP to xBABIP. They are due some big-time positive regression. The Astros' xBABIP ranks 20th in the league. This offense has not produced the batted ball data to suggest that it is a top offense, but it is nowhere near as bad as it has been so far this season.
xBABIP and the Astros Players
The small sample size causes a few unrealistic xBABIPs to occur. LJ Hoes and Jesus Guzman are not going to produce BABIPs of over .400, so I've included career BABIP in this table as well. Career BABIP is more reliable than xBABIP, but it takes over 1,000 plate appearances to normalize. I also included career minor league BABIP for some of the younger players who have not accumulated many plate appearances at the major league level.
I added color to indicate who was due the most regression and what type. Red means a player is due a little bit of possible negative regression. Green indicates a player is due some positive regression.
Final Observations
Jason Castro is Going to Rake
Both his xBABIP and his career BABIP suggest that Castro's current Mendoza Line batting average is not going to last. His current BABIP is .192, while his career BABIP and xBABIP are both over .300.
Robbie Grossman Will Straighten Things Out
I started this article before Robbie was sent down. He is due some major positive regression based on every type of predictive BABIP you can use. Now, his fielding on the other hand...
Jonathan Villar Could Be Pretty Special This Season
Villar has all the makings of a high-BABIP player. He has the speed to rack up bunt hits and infield hits. If his BABIP normalizes and he continues his improved defense, we may be watching a 2-to-3 WAR player.
George Springer Will Cause the Team BABIP to Increase, Dramatically
We have already witnessed how he will do this in his debut last night.
Springer's speed allows him to rack up infield hits like Villar. He carried a .380 career BABIP in the minor leagues. That is insanely high.
The Astros Will Be Fun to Watch When This BABIP Normalizes
If the pitching staff can continue to pitch the way it has, this may very well be a .500ish ball club. Balls will begin to find some holes and touch the outfield grass more often than they have been. It's just a matter of time.