Nothing like starting out with a headline question which I can't possibly answer with any certainty. Perhaps in keeping with my nom de plume, we will be "looking under the hood" at the Astros' win/loss record (in a statistical way, of course). Baseball analysts may seem overly concerned with discerning whether a team or player over- or underperformed. But, given concepts like regression to the mean, this is an important piece of information if you want to project the probability of a team improving in the future, or understand whether a team's roster should be reconstructed, or just tweaked, during the offseason.
Most of the regular readers at TCB are familiar with the concept of "Pythagorean Record." (And, no, it doesn't mean that ancient Greek mathematicians played baseball.) If you aren't familiar with it, you can look here, here and here for the definition and some critical discussion. The Pythag, as we sometimes call it, predicts the likely winning percentage of a team based on runs scored and runs allowed. Some would argue that a team which beats its Pythag projection was lucky and is probably due for some reversion of its win record in the future. However, this conclusion is disputable, given that some characteristics of team construction (a good back-end bullpen, for example) may help a team produce a winning record greater than projected by Pythag. Some teams seem to consistently over-achieve their Pythag; the Angels are frequently cited as Pythag-beaters in the Mike Scioscia era.
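For readers who want to see the arithmetic, here is a minimal sketch of the Pythagorean calculation. The function names and the run totals in the example are hypothetical, chosen only for illustration. The default exponent of 2 is the classic Bill James form; Baseball-Reference, as far as I know, uses an exponent of 1.83, so treat the exponent as an adjustable assumption.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Expected winning percentage from season run totals.

    exponent=2.0 is the classic form; 1.83 is a commonly used variant
    (an assumption here, not something stated in this article).
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def projected_wins(win_pct: float, games: int = 162) -> float:
    """Convert a winning percentage into wins over a season of `games`."""
    return win_pct * games


if __name__ == "__main__":
    # Hypothetical run totals, for illustration only (not the Astros' actual totals).
    pct = pythagorean_win_pct(650, 700, exponent=1.83)
    print(f"Pythag W%: {pct:.3f}, projected wins: {projected_wins(pct):.1f}")
```

The gap between a team's actual wins and `projected_wins(...)` is the over- or underperformance discussed in this post.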
The Astros significantly overperformed their Pythag in 2010. The Pythagorean projection used by Baseball-Reference.com projects 68 wins, rather than the actual 76 wins, based on run differentials. An 8 win overperformance is fairly sizeable, as these things go. The Astros have a recent history of overperforming their Pythag projection: 2 wins over in 2009; 9 wins over in 2008; and 1 win over in 2007. Some might say that the Astros have just been really, really lucky. Others will look for possible team-specific reasons: Is it the bullpen causing a good record in close games? Is it the manager? Is it some other element of team construction? Because Pythag takes only season run totals as its input, the only thing that can pull the actual record away from the projection is how those runs scored and runs allowed were distributed on a game by game basis during the season. A couple of years ago, AstroAndy did some fine work looking at improving the Pythag method by changing the way that outliers are handled, which led to some good discussion on the topic at TCB. I don't intend to retrace the arguments regarding the reliability of the Pythagorean record as a measure of over- or underperformance.
What about other methods of projecting team wins and losses? The Pythagorean method is an extremely high-level view of the team's record, since the only data it relies upon is the run differential during the season. If a projection based on more detailed performance data is closer to the Astros' actual record than the Pythag, that might ease some of the concern, raised by the Pythag, that the Astros' win record is the result of a lucky distribution of runs.
Although OPS (on-base percentage plus slugging percentage) is not a cutting-edge statistic, OPS originally was justified because it tracked run scoring better than traditional statistics like batting average and home runs. While admitting that OPS is not an ideal starting point for predicting a team's winning record, this article expounds on variants of formulas to predict winning percentage based on team OPS and OPS-allowed. I have used the version developed for teams with negative run differentials, which has a statistical reliability comparable to more complex win prediction methods:
W% = 1/(4*OPS Allowed/OPS - 2)
Applying this formula to the Astros' hitting and pitching OPS produces a win percentage of 45.6%, or 73.86 wins, which we can round to 74 wins. The Astros' projection based on OPS is two wins short of the actual 76 win record: an overperformance, but much less so than the Pythag indicates.
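The formula above is simple enough to sketch in a few lines. The function names are my own, and the .700 OPS values in the example are hypothetical placeholders rather than the Astros' actual figures; the only numbers taken from this post are the 45.6% result and the 162-game season.

```python
def ops_win_pct(team_ops: float, ops_allowed: float) -> float:
    """W% = 1 / (4 * OPS Allowed / OPS - 2), as given in the text."""
    return 1.0 / (4.0 * ops_allowed / team_ops - 2.0)


def implied_ops_ratio(win_pct: float) -> float:
    """Invert the formula: the OPS Allowed / OPS ratio implied by a given W%."""
    return (1.0 / win_pct + 2.0) / 4.0


if __name__ == "__main__":
    # Sanity check with hypothetical values: equal OPS and OPS allowed gives .500.
    print(ops_win_pct(0.700, 0.700))

    # The 45.6% figure quoted in the text, converted to wins over 162 games.
    wins = 0.456 * 162
    print(f"{wins:.2f} wins, which rounds to {round(wins)}")

    # Ratio of OPS allowed to OPS that would produce a 45.6% result.
    print(f"implied OPS Allowed / OPS: {implied_ops_ratio(0.456):.3f}")
```

One thing the sketch makes visible: the formula is only sensible when OPS and OPS allowed are reasonably close, since the denominator shrinks toward zero as the ratio approaches 0.5.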
A team can be more or less efficient than average in: (1) converting hits and base runners into runs; (2) preventing hits and base runners from turning into runs; and (3) distributing runs scored and runs allowed. The Pythagorean methods focus on a team's efficiency with respect to the third issue. The OPS based projection moves our inquiry to the first and second issues. The unanswered question for all three issues is whether the team's efficiency or inefficiency is random luck or caused by factors related to the team's composition.
I am particularly interested in the team's efficiency in converting OPS into runs. To analyze this, I compare the Astros' OPS and OPS-against to the league averages for OPS and OPS-against, as well as league average runs scored and runs allowed. I have used the rule of thumb that the run impact is twice the OPS difference; for example, a 10% difference compared to the league average OPS is expected to produce a 20% difference in runs scored. Based on this comparison of hitting and pitching OPS, the Astros scored 36 runs more than expected and allowed 10 runs more than expected, for a net "overperformance" of 26 runs. This is equivalent to 2.6 wins more than expected. The 2.6 win overperformance, calculated in this manner, is close to the 2.2 win difference between the OPS Win% formula prediction and the Astros' actual win record.
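As a rough sketch of that comparison, the code below applies the "twice the OPS difference" rule of thumb and then converts runs to wins. The function names and the league figures in the first example are hypothetical. The conversion of 10 runs per win is an assumption on my part, though it is the rate implied by treating 26 runs as 2.6 wins.

```python
RUNS_PER_WIN = 10.0  # assumption: implied by 26 runs being treated as 2.6 wins


def expected_runs(team_ops: float, league_ops: float, league_runs: float,
                  multiplier: float = 2.0) -> float:
    """Rule of thumb: the percent difference in runs is `multiplier` times
    the percent difference in OPS relative to the league average."""
    ops_diff = team_ops / league_ops - 1.0
    return league_runs * (1.0 + multiplier * ops_diff)


def net_wins_over_expected(extra_runs_scored: float,
                           extra_runs_allowed: float) -> float:
    """Net run overperformance converted to wins."""
    return (extra_runs_scored - extra_runs_allowed) / RUNS_PER_WIN


if __name__ == "__main__":
    # Hypothetical league figures: an OPS 10% above average implies 20% more runs.
    print(expected_runs(team_ops=0.770, league_ops=0.700, league_runs=700))

    # Figures from the text: 36 more runs scored and 10 more runs allowed
    # than expected.
    print(net_wins_over_expected(36, 10))
```

The same `expected_runs` function works for the pitching side by passing OPS-against and league average runs allowed.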
This would seem to corroborate the theory that the Astros' overperformance is based on the offense's above average efficiency in converting OPS into runs. Possibly this is due to random luck. However, it is also possible that other factors can explain this efficiency. I'll run through some hypotheses below:
Base Running Efficiency
Because OPS doesn't consider base running production, base running has to be examined as an efficiency booster. However, according to Bill James' base running analysis, the Astros are roughly average in team base running. Although this is a nice improvement from the Astros' performance in 2009, it doesn't seem likely to be a source of higher than average run scoring efficiency.
Clutch Offensive Performance
If a team is above average in OPS performance during critical situations, the efficiency of run production will be higher than average. There is some evidence that the Astros used clutch offense to increase their efficiency, depending on how "clutch" is defined. The Astros had a .739 OPS in high leverage situations, which is higher than the league average of .721 in high leverage situations. The Astros' OPS increased substantially in those situations, while the league OPS declined slightly in high leverage situations.
Productive Outs / Situational
Advancing runners and making productive outs are intended to increase efficiency in scoring runs. The Astros had the fourth best rate of productive outs in the NL. The Astros had the fifth best sacrifice bunt rate, a statistic which includes the contribution made by pitchers. The Astros had the fifth best rate of scoring the runner from third base with less than 2 outs. While these are all positive factors for improving run scoring efficiency, we don't know if the impact is significant. The Astros have a poor GIDP rate, and that could offset the beneficial impact of productive outs.
Contact Rate / Groundball Rate
The Astros are a high contact team, with the third lowest strikeout rate, as well as a groundball hitting team, with the league's highest groundball to flyball ratio. These characteristics may contribute to the high productive out ranking. The combination of contact hitting and groundballs tends to depress OPS. However, the groundball and contact tendencies increase the batting average on balls in play that the team can sustain, and possibly enhance productive outs and clutch hitting.
As I promised, I don't have the ultimate answer to tell us whether the Astros' record is lucky. My conclusion is that the Astros' record doesn't exceed expectations as much as the Pythagorean projection indicates. There are viable suggestions that the construction of the Astros team may explain some or most of the overperformance. However, I am still concerned that the Astros face some level of risk that their win record will regress without additional improvement in the team. Whether that risk is 1 or 2 wins, or even more, is difficult to quantify. To some degree, it may depend on whether one views the Astros' situational offensive performance as sustainable.