Traditional pitching statistics, especially wins, are pretty much useless when it comes to evaluating a pitcher and helping us predict his performance.
To begin, let’s take a more detailed look at what goes into making a pitcher successful in the first place. The pitcher is one of nine defensive players on the field. He is responsible for starting play by standing 60 ft. 6 in. away from the batter and delivering the ball to him. His goal is to get the batter out. He can throw a pitch that is actually a strike, with movement or speed that fools the hitter into missing it. He can throw a pitch that looks like a strike but isn’t, where again his movement or a deceptive arm angle fools the hitter. Or the hitter could put the pitch in play. The hitter might make good contact, through a variety of variables all coming together, and hit a solid line drive through the gap for a double. Perhaps on that play, though, the center fielder had cheated over just a little and was able to make a SportsCenter-worthy diving catch, turning the ball in play into an out. Of course, the hitter could have just meekly grounded to the SS, resulting in an easy 6-3 ground out. Yet the SS could have a momentary lapse in concentration. Perhaps he’s having marital problems or just really has to pee, and as a result of whatever is on his mind, he bobbles the ball and the meek ground ball turns into an E6. The pitcher, in the opinion of the official scorer, probably earned an out on that batter, but because the SS had to pee, he now has no outs and a runner on first base.
I’m not going to go back and count, but there are a lot of variables that go into a pitcher getting a single batter out, and his control over most of them ends as soon as he releases the ball from his hand. So it makes little sense that we put so much stock into a pitcher’s W-L record. Just last year, the Cy Young race was just as controversial as a West African presidential election, because 20-game winner Josh Beckett trumped 19-game winner CC Sabathia. In 241 IP, Sabathia struck out 209 while walking only 37. His strikeouts alone accounted for almost 1/3 of all of his outs. Beckett threw 200.7 IP, striking out 194 batters while walking 40; again, strikeouts accounted for about 1/3 of his outs. The difference I instantly see is that while Sabathia and Beckett were clearly two of the best in the business, the Red Sox had to use a lesser bullpen arm for 40.3 more innings than the Indians did. Sabathia is more valuable than Beckett in those terms alone. Yet pundits everywhere were crying foul because of that one win that separated them.
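If you want to check the math on those strikeout shares, here is a quick Python sketch. It assumes every inning is three outs and treats 200.7 as a decimal (200 and 2/3 innings); the function name is made up for illustration and isn't from any stats package.

```python
def strikeout_share_of_outs(innings_pitched, strikeouts):
    """Fraction of a pitcher's recorded outs that came by strikeout.

    Assumes innings_pitched is a true decimal (200.7 is roughly 200 2/3),
    not the box-score shorthand where .2 means two outs.
    """
    outs = round(innings_pitched * 3)  # three outs per inning
    return strikeouts / outs


# 2007 totals quoted in the paragraph above
sabathia_share = strikeout_share_of_outs(241.0, 209)
beckett_share = strikeout_share_of_outs(200.7, 194)

# Innings the Red Sox bullpen had to cover that the Indians bullpen did not
extra_bullpen_innings = 241.0 - 200.7

print(f"Sabathia: {sabathia_share:.3f} of outs by strikeout")
print(f"Beckett:  {beckett_share:.3f} of outs by strikeout")
print(f"Extra bullpen innings for Boston: {extra_bullpen_innings:.1f}")
```

Both shares land a little on either side of 30%, which is what "about 1/3" is loosely describing.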
So what goes into a pitcher winning a game? Well, take that first paragraph and multiply it by as many as 50. Then sprinkle in fatigue for the pitcher and the ability of the hitter to better recognize a pitcher’s guile as the game progresses. Also, your team has to score more runs than the other team before you exit the game, and then you have to trust your lead to as many as five different relievers; otherwise you’re heading for a no-decision. Oh yeah, you have to complete 5 IP. Why five, though? I can’t answer that. In that exercise, how much responsibility does a pitcher have for a win? Especially for the run scoring, in the case of the two AL pitchers who never hit?
Let’s isolate the run support issue. In 2007 Sabathia’s Indians provided him 5.10 Runs/9 in his starts. That’s not how many runs Sabathia got in the innings while he was the pitcher of record, but it’s the best I can do. Josh Beckett, of 20 Win glory, got 6.42 Runs/9 in his starts from the Red Sox. That means Beckett didn’t even have to be as good as Sabathia to earn a win, yet he could only muster one more win.
This doesn’t even begin to say who had the better bullpen support. We’ll skip the nuances of measuring that for now, but it’s pretty straightforward. How many times can we recall Oscar Villarreal blowing a lead this year? Or remember the time when Wesley Wright came into a game with 3 on and 1 out, but got us out of the inning with only one run given up? He converted a 2.42 Run Expectancy into a 1 run performance and saved 1.42 runs from scoring. Those 1.42 runs weren’t even his responsibility, but he saved them anyway. That’s the level of inanity you get when you evaluate starting pitchers on wins and then look at it through the lens of bullpen support.
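The Wesley Wright arithmetic is simple enough to write down as a sketch. The 2.42 figure is the Run Expectancy quoted above, taken as given; the function name is hypothetical, and a real version would look the expectancy up from a base-out table, which is not reproduced here.

```python
def runs_saved_by_reliever(run_expectancy_on_entry, runs_allowed):
    """Runs a reliever saved relative to what the situation was expected to yield.

    run_expectancy_on_entry: expected runs for the rest of the inning when he
        came in (assumed to come from a run expectancy table, not computed here).
    runs_allowed: runs that actually scored before the inning ended.
    A negative result means he let in more than expected.
    """
    return run_expectancy_on_entry - runs_allowed


# Bases loaded, 1 out, one run scores: the example from the paragraph above
wright_runs_saved = runs_saved_by_reliever(2.42, 1)
print(f"Runs saved: {wright_runs_saved:.2f}")
```

Whatever those inherited runners do goes on the starter’s line, which is exactly why the starter’s results depend on who follows him.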
So a Win is certainly a very poor way to evaluate a pitcher. I believe I’ve made a case for it, and I hope it makes sense to you. So how do we then measure a pitcher’s performance if Wins are an inept tool? To that end, a very valid tool developed by Baseball Prospectus is the Support-Neutral Statistics. "The Support-Neutral name comes from the fact that [Baseball Prospectus] is removing, or neutralizing, the variability of different levels of run support and bull-pen support...This gives a truer sense of how well a pitcher performed, without the distortions of offensive and defensive support." (Baseball Between the Numbers: Why Everything You Know about the Game is Wrong, 2007 pg. 52). It works like this: say Roger Clemens threw 7 IP of shutout baseball. BPro would then take that performance, assign it to a hypothetical league-average team, and see how often a league-average team would win given that performance. It turns out that is 85% of the time. So Roger Clemens earns .85 of a SN(W) and .15 of a SN(L). These are the same things as the E(W) for those of you who look at some of the BPro statistical reports. While they are not the perfect tool for analyzing a pitcher’s performance, they certainly come closer to capturing how much of a pitcher’s performance went into earning a win. Even there, though, there are limitations. These will be discussed in the DIPS and FIP section.
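To see how those fractional wins add up over more than one start, here is a minimal sketch. It assumes you already have, for each start, the chance a league-average team wins given that outing; how BPro actually produces those probabilities is not covered here, so the code just takes them as inputs. The function name and the second and third probabilities are hypothetical, and only the .85 comes from the Clemens example.

```python
def support_neutral_record(win_probabilities):
    """Tally support-neutral wins and losses from per-start win probabilities.

    Each start contributes p to SN(W) and (1 - p) to SN(L), where p is the
    assumed chance a league-average team wins given that outing.
    """
    sn_wins = sum(win_probabilities)
    sn_losses = sum(1 - p for p in win_probabilities)
    return sn_wins, sn_losses


# One start: 7 shutout innings, worth .85 per the example above
single_w, single_l = support_neutral_record([0.85])

# Three starts: the .85 outing plus two made-up outings for illustration
season_w, season_l = support_neutral_record([0.85, 0.50, 0.20])

print(f"One start:    SN(W) = {single_w:.2f}, SN(L) = {single_l:.2f}")
print(f"Three starts: SN(W) = {season_w:.2f}, SN(L) = {season_l:.2f}")
```

SN(W) plus SN(L) always equals the number of starts, so there is no such thing as a support-neutral no-decision in this tally.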
So, 7 IP of shutout baseball is actually worth about .85 of a win, if we exclude the defense backing the pitcher from this analysis. Now, I think every Astros fan can hearken back to 2005, when Roger Clemens went 13-8 on the strength of a 1.87 ERA. How could he have possibly gone 13-8 with that ERA? Because the Astros only averaged 3.43 Runs/9 in his starts. Roger Clemens missed out on the Cy Young that year, in spite of the fact that he was clearly the best pitcher in baseball, because he was deficient in an asinine and almost entirely luck-based statistic. So the next time you hear Steve Phillips, Joe Morgan, or Ed Wade talking about how many decisions a pitcher has won as a basis for defending an acquisition, you should bristle with indignation (yes, I was talking about Randy Wolf). If that’s the only good thing they can say about a pitcher, then they’re telling you he’s effectively worthless, but he sure did get a lot of help from the bats and gloves backing him.
Just to make it concrete: a pitcher’s winning percentage has a year-to-year correlation of .202 in predicting his future performance. For those of you who have forgotten your Stats 101 (I had to Wikipedia it, so don’t feel too bad), correlation measures the linear relationship between two variables, in this case Win Percentage one year and Win Percentage the next. Correlation coefficients range from -1 to 1. -1 means that there is an opposite relationship: high one year predicts low the next year. 0 means the two variables have no linear relationship, so knowing one tells you nothing about the other. 1 means that there is a lot of stability in predicting the variable from year to year, given the first (click the .202 link for a better explanation than what I just paraphrased). In general, correlation coefficients less than .3 in either direction are weak and pretty meaningless. A coefficient of .6 to .7 is where a relationship starts to look strong, but that won’t be important until later. Hopefully my convoluted explanation of correlation coefficients drove this point home: Wins are not a valuable tool because there is no demonstrable skill in them; a pitcher’s Winning Percentage cannot be reliably predicted from year to year. Why is this? It’s because there are more factors that go into getting a winning decision that aren’t in a pitcher’s control than there are in a pitcher’s control.
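For anyone who would rather see the correlation coefficient than read about it, here is a small sketch of the standard Pearson formula in plain Python. The winning percentages in it are invented purely to show the mechanics; they are not the data behind the .202 figure, and the function name is just a label.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables move together around their means
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical pitchers: winning percentage in year one and in year two
win_pct_year_one = [0.650, 0.600, 0.550, 0.450, 0.400, 0.500]
win_pct_year_two = [0.480, 0.610, 0.420, 0.530, 0.470, 0.560]

r = pearson_r(win_pct_year_one, win_pct_year_two)
print(f"Year-to-year correlation: {r:.3f}")
```

Feed it each pitcher’s winning percentage in consecutive seasons and you get one number between -1 and 1; the closer to 0, the less last year tells you about this year.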
Next time (when we look at pitchers again), we’ll look at ERA and how valuable a tool it is or is not for determining a pitcher’s performance, and why looking at a pitcher’s rate stats paints a much better picture.