The two phrases uttered most often around The Crawfish Boxes lately have been "regression" and "small sample size!" Given the Astros hitters' odoriferous April, arguments using those concepts have poured forth from statistically-minded commentators, much to the dismay of those who prefer to eschew such things. In fact, multiple TCB readers (and you know who you are!) have accused us writers and others of hiding behind sabermetric concepts to defend the indefensible - namely, the Astros' sub-Mendoza-line batting average and high strikeout rates. Some are tired of hearing about it.
Please don't stop reading here. This article is all about providing context to the discussion. Nobody disputes that the Astros have been terrible at hitting baseballs in 2014 thus far. Rather, the debate seems centered on what the Astros will do moving forward. Will "Kris Karter" continue to strike out at a 40% clip? Have the Astros assembled a group of hitters that are historically inept at reaching base via contact? Maybe, claim some. Maybe not, say others.
Small Sample Sizes
As an impatient society, we dislike waiting for results. Good things come to those who wait, our grandparents said. To heck with that! Give me what I want now, or else go away. In this era of instantaneous 140-character communication, many of us are conditioned to think this way whether we realize it or not. Thus, any argument that states, "small data samples can be ignored as inconclusive because there hasn't been enough time for stats to stabilize" is a distasteful one to many, whether valid or not.
But in baseball, sample size arguments are incredibly valid when trying to project future performance.
One thing that has become evident to me during recent conversations in the message board is that a vast number of fans seem to unconsciously expect that a .300 hitter should be a .300 hitter all season. Everybody knows to ignore Game One, where a .300 hitter might go 1 for 4, because, well, it's just the first game, right? But by week three, how come that guy is only hitting .200? What a bum!
.300 hitters don't hit .300 every week, or even every month of the season. They'll have .250 months followed by .350 months. That's why 162 games are needed to separate the wheat from the chaff. Sample sizes of even a month, or about 100 plate appearances, are nearly meaningless in the grand scheme. In the sport of player evaluation, patience truly is key, and instantaneous reaction is the enemy of a successful ball club.
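For a rough sense of how often a true .300 hitter lands on a ".250 month" or a ".350 month," here is a back-of-the-envelope sketch. It rests on assumptions that are mine, not measured facts: a month is treated as exactly 100 at-bats, every at-bat is an independent coin flip with a 30% chance of a hit, and the function name is made up for illustration. Real hitters face different pitchers, parks and injuries, so read it as a toy model only.

```python
from math import comb

def prob_hits_between(at_bats, true_avg, low_hits, high_hits):
    """Exact binomial chance of collecting low_hits..high_hits (inclusive)."""
    return sum(
        comb(at_bats, k) * true_avg**k * (1 - true_avg) ** (at_bats - k)
        for k in range(low_hits, high_hits + 1)
    )

AT_BATS = 100     # assumption: roughly one month of at-bats
TRUE_AVG = 0.300  # assumption: the hitter's "real" talent level

# Chance the month ends at .250 or worse (25 hits or fewer)
p_cold = prob_hits_between(AT_BATS, TRUE_AVG, 0, 25)
# Chance the month ends at .350 or better (35 hits or more)
p_hot = prob_hits_between(AT_BATS, TRUE_AVG, 35, AT_BATS)

print(f"P(.250 or worse over {AT_BATS} AB): {p_cold:.3f}")
print(f"P(.350 or better over {AT_BATS} AB): {p_hot:.3f}")
print(f"P(either extreme): {p_cold + p_hot:.3f}")
```

Under those assumptions each extreme comes up roughly one month in six, so a genuinely .300 hitter having a month that looks alarming (or amazing) is not rare at all.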
Look at the graph below, which shows the trends in Lance Berkman's 2005 game-by-game batting averages. The zig-zag is his per-game batting average (1-for-4, 3-for-4, 0-for-5, etc). The dotted line is a linear representation of the overall trend in his batting average during the season. The curvy line is the most important one for this discussion though, as it shows the hot streaks and slumps that Berkman experienced during 2005.
Berkman's 2005 season contained two major slumps and two hot streaks. Though his season batting average was .293, at no point during the season did it hold steady at that level. Berkman's .234 May average coincided with a team-wide slump (he was hurt in April), and his .362 July average coincided with the Astros pulling out of that slump. Then in August, he dipped back down to .242 for the month, and then back up to almost .400 during September.
None of those variations show up in his cumulative season batting average, though. By the beginning of August, Berkman sported a season batting average around .300, and though it dipped to .280-ish by early September, his overall batting average hid the magnitude of his August slump because the slump occurred near the end of the season.
In May, we all knew that Berkman would pull out of it. He was a star, and that's what stars do, right? Nah. That's what baseball players do. All of them.
Example 1: Jason Castro
Non-superstars go through the same peaks and valleys. To date, Jason Castro is hitting .221. But we know Castro isn't a .221 hitter. The problem is that his terrible slump coincided with the beginning of the year, when peaks and valleys most affect cumulative season batting average.
Fortunately, because of the magic of small sample sizes, it doesn't take much to put things aright.
Over 77 plate appearances this season, Castro has hit .221, at odds with his career .252 batting average. Let's pretend that Castro today begins the climb out of his slump and hits .350 for his next 77 plate appearances (possible due to the magic of SSS!). After those next 77 plate appearances, his season average now stands at .285! Less charitably, if he "only" hits .280 during a 77-PA hot streak, his season average will reach a more-palatable .250.
See how quickly things can change due to a hot streak?
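For anyone who wants to check the Castro arithmetic or plug in a different hot streak, here is a minimal sketch. One simplification to flag loudly: batting average is really hits per at-bat, not per plate appearance, so this treats each 77-PA stretch as 77 at-bats (walks and the like ignored). The function name is just something I made up for illustration.

```python
def combined_average(avg_a, at_bats_a, avg_b, at_bats_b):
    """Season average after two stretches, weighted by at-bats in each."""
    # Hits are reconstructed from the averages, so they may be fractional;
    # good enough for a what-if, not for an official box score.
    total_hits = avg_a * at_bats_a + avg_b * at_bats_b
    return total_hits / (at_bats_a + at_bats_b)

SLUMP_AVG = 0.221  # Castro's average so far (from the article)
STRETCH = 77       # assumption: 77 PA treated as 77 at-bats

hot = combined_average(SLUMP_AVG, STRETCH, 0.350, STRETCH)
warm = combined_average(SLUMP_AVG, STRETCH, 0.280, STRETCH)

print(f"Slump followed by a .350 stretch: {hot:.4f}")
print(f"Slump followed by a .280 stretch: {warm:.4f}")
```

Because the two stretches are the same length, the result is just the midpoint of the two averages, which is why the numbers above work out so neatly. Make the second stretch longer than the first and the early slump counts for even less.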
Example 2: Chris Carter
Jason Castro was an All-Star last year. How about somebody with a bit less shine, current whipping-boy Chris Carter? Through the first three weeks of the season, Carter hit an embarrassing .123, leading to calls for his head, or at least for his release. After just three weeks.
In week four, Carter hit .300, which raised his season average by a massive 46 points in only six games. If Carter continues to hit .300 next week also, his season batting average will be almost double what it was after week three. If Carter averages .240 for the rest of the season (possible, since he's done it before in the majors), his season batting average will stand at .220, just where it was last season.
It's irrelevant that .220 is still not good. A .240 average is great for a guy with Carter's power and patience, and what matters more than his overall season line is what he will do from this point forward, not how badly his .123 start affected his season-long batting average.
Example 3: Other Guys
The largest problem is that the entire Astros lineup slumped at the same time to start the season. Was it the virus that attacked the clubhouse? Typical slow start for hitters? The lower run-scoring environment across baseball? It doesn't really matter. The important takeaway is that it was indeed a slump, and not indicative of how the team will perform going forward.
Why, just in the last week, Astros hitters have shown signs of breaking out of the slump:
Alex Presley - .450/.450/.550
Chris Carter - .300/.417/.800 (I have nice dreams about him doing this all year, but that would ignore SSS)
Dexter Fowler - .308/.419/.462
Jonathan Villar - .333/.333/.619
Jose Altuve - .323/.364/.516
Matt Dominguez - .292/.370/.583
The Moneyball Connection
Small sample sizes and the variability of stats over short time periods are why Billy Beane called the playoffs a "crap shoot" in Michael Lewis' Moneyball. Playoff scoring is determined by which batters are hot, not by overall talent level of the offense. Again, the 2005 Astros are a great example of this, as they were 24th in the majors in runs scored during the regular season, but hit .252/.329/.391 in the playoffs (which was actually one of the better offensive performances in the playoffs that season). Hitters' performances are too prone to the fickle nature of hot streaks and slumps to use their overall talent to predict success or failure in the postseason.
The word some of you have dreaded: regression. Fortunately, I'll keep this brief and simple. Regression is simply the theory that, over time, players will trend towards their long-term average performance level. From 2003 to 2006, Berkman's batting average sat around .300. The principle of regression dictates that when he was hitting .230 in May 2005, the most likely future outcome was that his batting average would come up towards .300. Likewise, when Berkman was hitting almost .400 in June, regression dictated that it would come back down, towards .300 (note, I didn't say to .300 exactly). Over the long run, it settled at .293, very close to the .300-ish average he sustained during that portion of his career.
Think about a slinky. If you hold a slinky in your hand from the top, and let it go, it will bounce up and down until it settles at a level somewhere between its lowest reach and its collapsed state. That middle level is where it wants to be, and it will bounce lower and higher than that spot until it eventually stops moving. While statisticians would eviscerate me for that analogy, it still provides a good example of how things want to "regress to the middle". But every player (and every slinky) has a different middle.
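To put rough numbers on the Berkman example, here is one common back-of-the-envelope way to "regress" a hot or cold stretch: blend what the player just did with a chunk of at-bats at his long-term average. This is not the method behind any particular projection system, and the two constants are purely hypothetical choices of mine: I'm assuming the stretch lasted 100 at-bats and that 500 at-bats of long-term performance is a reasonable amount of "ballast."

```python
def regressed_average(observed_avg, at_bats, long_term_avg, ballast_at_bats):
    """Blend a recent stretch with a block of at-bats at the long-term average."""
    observed_hits = observed_avg * at_bats
    ballast_hits = long_term_avg * ballast_at_bats
    return (observed_hits + ballast_hits) / (at_bats + ballast_at_bats)

LONG_TERM_AVG = 0.300   # Berkman's rough 2003-2006 level (from the article)
SAMPLE_AT_BATS = 100    # hypothetical length of the hot or cold stretch
BALLAST_AT_BATS = 500   # hypothetical weight given to the long-term average

# Cold stretch: hitting .230, as in May 2005
cold_estimate = regressed_average(0.230, SAMPLE_AT_BATS, LONG_TERM_AVG, BALLAST_AT_BATS)
# Hot stretch: hitting about .400
hot_estimate = regressed_average(0.400, SAMPLE_AT_BATS, LONG_TERM_AVG, BALLAST_AT_BATS)

print(f"Best guess going forward after a .230 stretch: {cold_estimate:.3f}")
print(f"Best guess going forward after a .400 stretch: {hot_estimate:.3f}")
```

Both estimates get pulled most of the way back towards .300 without landing on it exactly, which is the "towards, not to" point. A smaller ballast number would trust the recent stretch more; a larger one would trust it less.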
Some tools can be used to help identify when regression is in order. As Seth showed in the article linked above, BABIP and xBABIP are two of the prime indicators that something is out-of-whack with other stats, as are batted ball rates. I won't go into any detail here on how or why those metrics help project future performance, other than to say that's what we're usually talking about when we predict regression and use it as the reason to defend or caution against certain players' current performances.
I hope this article helped explain how small data samples and the principle of regression can color the perception of a player's performance and future expectations. The Astros' hitters have experienced a team-wide slump to begin 2014, but regression indicators all point in the positive direction for the woe-begotten offense. We began to see some of this taking shape with the recent boost in hitting performance. Over the past week, the Astros are scoring 4.7 runs per game, compared to 2.7 in the nineteen previous games.
That, my friends, is due to regression.