It's that time of year again, when various outlets publish their projections for the performances of baseball players for the upcoming season. As usual, Steamer has beaten everybody out of the gate, as its projections have been published over at Fangraphs and continuously updated as the off-season unfolds. Then Friday, friend of TCB Dan Szymborski sent over his ZiPS projections for the Astros to Fangraphs, where they were published with the thoughts of Carson Cistulli.
Even though projection systems have been around for over a decade, the general public still misunderstands them and uses them to get its dander up before the season. What? Chris Carter projected for only 30 home runs? What kind of crock is that?
Baseball projection systems, in a nutshell
Projection systems such as ZiPS are not prediction systems. They don't look into a crystal ball and forecast what a player will do during the course of the upcoming season. As such, they should not be taken as gospel, but rather as general pictures of a player's skill level. They are estimates, born of statistical analysis comparing a player's recent performances with a database of historical players.
Here's how baseball projection systems work, generally. They find players who have had comparable seasons at comparable ages, regress stats such as BABIP and HR/FB that are known to drift back toward a player's career mean, and then run hundreds of simulations per player, each one enacting a faux-2015 season based on the player's skill set.
After all that number-crunching, the end result that is printed on Fangraphs for ZiPS, such as Carter's 70-30-85-4, .228/.316/.466, represents the 50th percentile, or midpoint, of all of that player's simulations. This is the important part. The midpoint suggests that, all other things being equal (luck, health, no unexpected change in approach, development, learning a new pitch, eye surgery, etc.), this is a baseline around which a player is expected to perform during the upcoming season. Roughly half of the simulated seasons come out better than that line and half come out worse.
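To make the "simulate, then take the midpoint" idea concrete, here is a minimal Python sketch. It is not how ZiPS is actually built: the rates, the 50/50 regression weight, and the function names are all hypothetical, chosen only to show how a 50th-percentile baseline falls out of many simulated seasons, with plenty of room on either side of it.

```python
import random

def regress_to_mean(observed_rate, career_rate, weight=0.5):
    """Pull an observed rate partway back toward a career rate.

    `weight` is a made-up shrinkage factor for illustration,
    not a value published by any projection system.
    """
    return weight * career_rate + (1 - weight) * observed_rate

def simulate_home_runs(hr_per_pa, plate_appearances, n_seasons=500, seed=2015):
    """Simulate n_seasons faux seasons; return sorted home run totals."""
    rng = random.Random(seed)  # fixed seed so reruns give the same seasons
    totals = []
    for _ in range(n_seasons):
        # Each plate appearance is an independent coin flip for a home run.
        hr = sum(1 for _ in range(plate_appearances) if rng.random() < hr_per_pa)
        totals.append(hr)
    return sorted(totals)

def percentile(sorted_values, pct):
    """Pick the value sitting pct percent of the way through a sorted list."""
    index = round(pct / 100 * (len(sorted_values) - 1))
    return sorted_values[index]

# Hypothetical hitter: 6.0% HR per PA last year, 5.0% for his career.
hr_rate = regress_to_mean(observed_rate=0.060, career_rate=0.050)
seasons = simulate_home_runs(hr_rate, plate_appearances=600)
p10, p50, p90 = (percentile(seasons, p) for p in (10, 50, 90))
print(f"10th: {p10} HR, 50th (baseline): {p50} HR, 90th: {p90} HR")
```

The number a projection system publishes corresponds to the middle value here; the 10th and 90th percentile seasons come from the very same "true talent" and differ only by luck.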
Limitations of projection systems
Projection systems are useful for establishing a baseline, but what they can't do is anticipate the unexpected, and so a low projection for a favorite player should not be cause for undue alarm. However, his projection should be respected as a very likely possibility, since it is rooted in concrete historical data.
For example, prior to 2014, ZiPS projected Dallas Keuchel's 50% baseline as a 5.02 ERA over 156 innings pitched. This actually acknowledged that Keuchel should improve over his prior season ERAs of 5.27 and 5.15, but that his past performance and historical comps pointed to him being a below-average pitcher. But ZiPS could not have anticipated that new pitching coach Brent Strom would advise Keuchel to stop throwing his curveball and to use his four-seamer half as often as in the past. The results of coaching and proper implementation of a new approach led to Keuchel's 2.93 ERA in 2014. This was not a failure of ZiPS to project Keuchel; rather, it showed that Keuchel and the Astros were able to make a change, breaking Keuchel out of the mold of the historical pitchers who used to compare to him.
Every projection system missed on him in the same way last year. Baseball Prospectus' PECOTA forecasted that in the top-performing 10% of simulations, Keuchel would only manage a 3.92 ERA in 128 innings pitched. Such is the power of development. To be fair to ZiPS and PECOTA though, it is important to point out how rare a breakout such as Keuchel's is. Historically, pitchers with his 2012-2013 profile continue along the path of non-development and a short career.
ZiPS can be wrong in the other direction too, much to the dismay of fans. Prior to 2014, we all probably hoped that Matt Dominguez would outperform his projected baseline of .249/.294/.395 and 63-17-74, right? Sadly, for whatever reason, he wimped his way into a .215/.256/.330 line. It happens.
By and large, though, ZiPS and the other established projection systems are reliable snapshots of a player's level of ability, and have a credible level of accuracy. The under- and over-estimations cancel each other out over the long haul. So the key is to identify, for individual players, where there might be some circumstance invisible to a database that could cause a player to miss his projection in either direction.
Some Thoughts on 2015 Astros' ZiPS Projections
- The negative first. Jason Castro's 2014 batting line of .222/.286/.366 was supported by a .294 BABIP. He walked less than in the past, and ZiPS thinks he will improve in that direction. However, it's pretty easy to make the case that Castro's ZiPS projection of .242/.313/.411 is a little optimistic, as it's built in part on his excellent 2013 season. Same with Matt Dominguez, projected to improve to .239/.283/.374. Steamer is less enthusiastic on both men, and short of a change in approach as I discussed in my examination of Dominguez, it's hard to buy those projections.
- Similarly, Marwin Gonzalez' dip back to the performance level expected of a role player is believable, given his high 2014 BABIP and a 2014 wRC+ that doubled his 2013 mark. ZiPS is actually kinder than Steamer, and projects him for an acceptable line for a backup: .251/.293/.353.
- Neither projection system thinks much of Jake Marisnick's bat, at .239/.285/.361. Astros fans have his impressive minor league career to dream on though, and he's only 23 years old. And dat defense, doe. Because of his defense, ZiPS still projects 1.2 WAR.
- Before 2014, ZiPS projected 16 WAR from the Astros' everyday lineup, rotation, and relief corps. The 2014 Astros actually reached 24 WAR, mostly due to leaps forward from Keuchel, Collin McHugh, and Jose Altuve. Their aggregate WAR was held back by significant negative contributions from Jon Singleton, Matt Dominguez, the pre-all-star break bullpen, and Marc Krauss.
- Going into 2015, ZiPS projects 26 WAR from the Astros' everyday lineup, rotation, and relief corps.
- ZiPS projects some negative regression from Altuve, Keuchel, McHugh, and Scott Feldman. This is expected, since ZiPS accounts for (and weights) data from multiple previous seasons, not just the most recent. For those first three to beat their projections, they will need to prove that 2014 was not a complete fluke, as some of us believe. Since in all three cases, the breakouts coincided with a change in approach, there is reason for optimism. In Feldman's case, ZiPS reasonably expects a return toward career walk and home run rates, which will increase his ERA and FIP.
- ZiPS projects positive regression for some likely candidates - Dominguez and Singleton. This recognizes that both suffered from cripplingly low BABIP during 2014, which should be correctable.
- Candidates for beating projections - Singleton, Grossman, Marisnick, Carter. Grossman, Marisnick, and Carter all have several seasons' worth of major league data to input into a projection system. Grossman and Marisnick, though, both had strong minor league careers, and their playing time has been sporadic, interrupted, and inconsistent. Grossman has already shown that he can sustain success over an extended period (.262/.357/.349 post-ASB last season), and ZiPS probably can't see that Marisnick was rushed to the majors before his development showed that he was ready.
- Based on minor league stats alone, ZiPS projected prior to 2014 that Singleton could hit .233/.325/.398 in the majors as a 22-year-old. He failed to meet that level of production, and so his ZiPS projection for 2015 is a less-enthusiastic .218/.321/.416 at a more advanced age. However, ZiPS can't know about the 50-game suspension and Singleton's struggle with rehab and how that may have affected his performance in 2014. Given a full year in 2015, the low BABIP, and the fact that ZiPS once thought he was a better player than it does now, fans should have every reason to expect him to outperform this projection.
- George Springer's top comp is Greg Vaughn, who once hit 50 home runs with a .272/.363/.597 slash line. Last season, Springer's top comp was Mike Cameron. In reality, Springer could pair Vaughn's power with Cameron's base stealing and defense. Astros fans should be very, very excited.
- Springer's projected counting stats are likely low, as he's only projected for 493 plate appearances due to his missed time in 2014. If Springer reaches the 610 PA threshold projected by Steamer, his ZiPS line would be 100 runs, 36 HR, 103 RBI, 24 stolen bases. Remember that this is only the 50th-percentile projection, so it is entirely possible for him to beat it. It makes one shiver. How does 110-40-120-35 sound, Astros fans? Technically, given this projection of baseline talent, it is possible.
- If he were to get major league playing time, ZiPS thinks that Carlos Correa would hit .247/.311/.357 as a 20-year-old. That's a higher wOBA than Marwin Gonzalez, Gregorio Petit, Jonathan Villar, and Matt Dominguez. ZiPS might be implying that the Astros might win more games with Lowrie at third and Correa at short. Correa's top comp right now is former Astros shortstop Dickie Thon, which is just kind of fun. Astros fans should not call for Correa to play in 2015 though. With another year of development, his top comp might be Alex Rodriguez or Troy Tulowitzki. Patience is a virtue.
- ZiPS doesn't agree with me on Oberholtzer's pending breakout. I look forward to rubbing it in Mr. Szymborski's face when Oberholtzer becomes a 3.50 ERA pitcher in 2015.
- Castro/Conger's collective WAR at catcher does not include wins added through elite pitch framing. Since pitch framing is the topic du jour this off-season much the way pitch tunneling was last season, I felt the need to throw that in.
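A quick aside on the Springer pro-rating in the list above: stretching a projection to a different playing-time total is just linear scaling of the counting stats by plate appearances. The Python sketch below starts from the 610 PA line quoted above and scales it to a purely hypothetical 650 PA season. The `prorate` helper is an illustrative name, and the approach assumes his per-PA rates would hold steady over the extra playing time, which is a simplification.

```python
def prorate(line, from_pa, to_pa):
    """Scale counting stats linearly with plate appearances."""
    factor = to_pa / from_pa
    return {stat: round(value * factor) for stat, value in line.items()}

# Springer's ZiPS counting stats stretched to 610 PA, as quoted in the text.
springer_610 = {"R": 100, "HR": 36, "RBI": 103, "SB": 24}

# 650 PA is a hypothetical workload chosen only for illustration.
scaled = prorate(springer_610, from_pa=610, to_pa=650)
print(scaled)
```

Rate stats such as the slash line don't change under this kind of scaling; only the counting stats move.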