Baseball Statistics and You

I talk about John Sickels a lot. He runs a sibling site here on SBNation, Minor League Ball. He has also published a prospect book for the last few years that is a gold mine of data on the minor leagues. Oh, and he used to work for Bill James. That pretty much makes him a guy I'd be lucky to emulate.

Which is why I gave his recent post about sabermetrics more than a passing thought. I read all the responses to it. I read both stories over at The Hardball Times here and here. I read this one at BPro and the one over at THE BOOK. Did I mention these two? It seems I wasn't the only one to esteem John and to give his opinion merit.

I took my time to respond, though, because I wanted to give this more thought. Here is what I came up with. It's not the usual TCB post as it only tangentially touches on the Astros. Bear with me.

One of the reasons my wife hates discussing any esoteric topic with me is that I can usually see both sides and choose whichever fits my mood to argue. So, I see what Will Carroll is trying to say and agree to some extent. After all, I have been slowly trying to work more advanced metrics into my game stories for an actual, ink-on-paper news source. Imagine that. But I can also see where Tango is coming from. Dropping everything down to the lowest common denominator often diminishes the value of the thing in the first place. By 'dumbing down' stat talk, you're not giving your audience enough credit. But it's more than just that.

I’m about to do something that no scientific method allows; I’m going to make broad assumptions. For instance, I assume that you, the readers, enjoy my work and respect my opinions. Because of that, you expect me to be some sort of authority on what I’m saying. Basically, I should know what I’m talking about if I’m going to say anything at all. 

I try to think about that every time I write an article. Sometimes, I may ask rhetorical questions or try to get your input, but many times, I’m giving you my opinion on a story or circumstance and want that to be an informed opinion.

Essentially, David The Baseball Fan and David The Writer become two different people. As a fan, I can read the stuff that Bill James, Tom Tango and the rest of the sabermetric community are doing with interest. I can wander through lists of career Win Shares in the Historical Baseball Abstract and wonder how they got there. I can look at things like marginal run values or regression analyses and know there’s good stuff in there for me to understand the game better. Many times, though, I’m not quite there yet.

Don't get me wrong, I'm a geek. An Isaac Asimov-reading, computer-programming, Star Trek geek. My skills just lie more on the programming side than they do on the math side. I'd rather set up my Excel table to do all the advanced calculations for me and worry over it just once.

Like John, I was a liberal arts major. I was only required to take two math classes in college and, thanks to a great AP Calculus teacher, tested out of both semesters. That’s right, I haven’t taken a math course since high school. I’ve never taken a statistics course. For a lot of these new metrics, I have to do some serious studying to refresh myself on what it is the math is saying.

At the same time, there is a lot of data out there that is simple and useful. Take the Pitch F/X stuff. I’ve seen some writers out there like JC Bradbury say that the Pitch F/X information hasn’t really yielded anything conclusive yet. For me, though, that couldn’t be further from the truth. I was worried that I wouldn’t be able to wrap my head around what the data meant. However, the stuff you can visualize from the info is much more accessible than I had expected.

With Pitch F/X, you can see where a pitcher throws the ball, what his pitches look like, how much they move, how fast he throws and what his average speed is, and on and on. Graphing the data turns it into something any baseball fan can understand and appreciate. Much like the GameTracker application, it’s an intuitive, graphic way to follow what’s going on.

Those are the kinds of things I can write about. I have also found that calculating the stats myself really helps me understand how each metric is put together. When I’m tracking minor league players, I have a spreadsheet that will calculate Runs Created, RC27, wOBA and even OPS+. That last one seems easy, but I’ve had to correct for park effects too. My minor league spreadsheet evolves a little more each season, as I get more comfortable using certain things and reject others as useless. Just like any good scientist.
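For anyone curious what a spreadsheet like that is doing under the hood, here is a minimal sketch in Python. It is not the actual spreadsheet, just an illustration of the basic Bill James Runs Created formula, (H + BB) * TB / (AB + BB), and RC27 built on top of it. Two assumptions to flag: outs are counted simply as at-bats minus hits, which ignores things like caught stealing and double plays, and the stat line at the bottom is made up.

```python
def runs_created(hits, walks, total_bases, at_bats):
    """Basic Runs Created: (H + BB) * TB / (AB + BB)."""
    opportunities = at_bats + walks
    if opportunities == 0:
        return 0.0
    return (hits + walks) * total_bases / opportunities


def rc27(hits, walks, total_bases, at_bats):
    """Runs Created per 27 outs.

    Assumption: outs = AB - H. Fuller versions also count
    caught stealing, double plays and sacrifices as outs.
    """
    outs = at_bats - hits
    if outs == 0:
        return 0.0
    return 27 * runs_created(hits, walks, total_bases, at_bats) / outs


# Hypothetical season line, invented purely for illustration.
line = {"hits": 150, "walks": 50, "total_bases": 240, "at_bats": 500}
print(round(runs_created(**line), 1))
print(round(rc27(**line), 2))
```

Writing the formula out once, the same way you would in a spreadsheet cell, is a good way to see which inputs actually drive the number.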

But, ultimately, I’m required to speak about something to an audience. That's why it may take me longer to assimilate something like SIERA into my writing. Or to discuss whether I like wOBA or tRA better. They're both just a little bit over my head statistically right now, but give it time. I'll get there.

I know it's a saber-sin, but David the Baseball Fan still gets excited for stuff like batting average and RBIs. A couple years ago, I was doing my minor league stat charting thing and saw this guy I hadn't heard of flirting with a .400 average for the month of April. A couple months later, I was disappointed to see Matt Cusick get sent to the Yankees for LaTroy Hawkins. That's what the stat community misses sometimes. Fans like following batting average, RBIs and home runs. They're easy to understand, and there's a history to them that you miss with other stat races. No one talks about the time Barry Bonds posted an OBP of .609 the way they talk about Ted Williams hitting .406.

At the same time, we, as analysts, should help people understand that those stats don't necessarily mark a player's talent. Batting average is influenced by things like BABIP. Better measures of how important a player is to a team are OBP and SLG. RBIs are tied more to the teammates higher in the batting order and the random chance of scoring opportunities. Making the distinction between a player's true talent and traditional stats shouldn't diminish either.
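To make those three stats concrete, here is a rough sketch in Python using the standard definitions: BABIP is (H - HR) / (AB - K - HR + SF), OBP is (H + BB + HBP) / (AB + BB + HBP + SF), and SLG is total bases divided by at-bats. The function names and the sample numbers are invented for illustration.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play == 0:
        return 0.0
    return (hits - home_runs) / balls_in_play


def obp(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """On-base percentage."""
    plate_chances = at_bats + walks + hit_by_pitch + sac_flies
    if plate_chances == 0:
        return 0.0
    return (hits + walks + hit_by_pitch) / plate_chances


def slg(singles, doubles, triples, home_runs, at_bats):
    """Slugging percentage: total bases per at-bat."""
    if at_bats == 0:
        return 0.0
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


# Hypothetical hitter: 500 AB, 150 H (95 1B, 30 2B, 5 3B, 20 HR),
# 50 BB, 5 HBP, 100 K, 5 SF. A .300 batting average.
print(round(babip(150, 20, 500, 100, 5), 3))
print(round(obp(150, 50, 5, 500, 5), 3))
print(round(slg(95, 30, 5, 20, 500), 3))
```

The same .300 hitter can look very different depending on how many of those hits were balls in play that happened to fall in, how often he walks and how much power sits behind the average.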

Take another minor leaguer I discovered and took quite a shine to a few years back. This guy didn't have great power or a great batting average. He did have an excellent OBP and didn't strike out much. Later that summer, he was traded to the then-Devil Rays for Aubrey Huff. Though I discovered both Ben Zobrist and Matt Cusick in similar ways, it was Zobrist's on-base skills that caught my eye. I had a hunch that Zobrist could be a good player; he turned into Zorilla. I thought Cusick's knack for hitting for average was cool, but I'm under no illusions about his pro potential.

Maybe you TCB readers have come to similar conclusions. We're a pretty stat-happy group here. At the same time, it's nice to appreciate baseball's simplicity at times. I would like to hear from you, though. Leave a comment with your own love-hate relationship with statistics. What pulled you in? Where do you draw the line? Do math and baseball go hand-in-hand for you?