Just an initial note: the name is truly not meant to be offensive. It’s a nod to the old self-help style books for learning a topic.
So I’m giving advance warning that this may be boring to some, as it is a pretty dry topic. I don’t claim to be an expert in this area, and I have no problem with (and am even asking for) people commenting to add which stats they use and what the benefits of those stats are versus others. There are college courses that dig into the study and advancement of these predictive stats, which is another reason they are always evolving.
I have seen a lot of new faces on the board, and realized I wish someone had made a general “guide” to the stats used to evaluate players. For full details on any of these stats, their correlation to performance, or even how they’re calculated, use Fangraphs or another resource. The goal here is to stay high level and simply explain the stats in layman’s terms.
Why Advanced Stats
There’s an old argument that advanced stats are ruining the game. I personally don’t subscribe to that view, but I do understand where it comes from, since these stats are not what you see happening in front of you.
Advanced Stats look to find the underlying reasons things occur to more accurately identify the player’s contributions and potentially improve our ability to predict the outcomes in the future for a player.
Sample Size Matters. A lot of these statistics take time to normalize. Just like flipping a coin a few times and getting heads doesn’t mean you’ll always get heads – the larger the sample size, the more accurate the data.
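To put a rough number on the coin flip idea, here is a minimal sketch. It assumes every trial (a flip, a ball in play, a plate appearance) is independent with the same true rate, which real baseball only approximates. The function name `rate_noise` is made up for this post.

```python
import math

def rate_noise(true_rate: float, trials: int) -> float:
    """Standard deviation of an observed rate over `trials` independent tries."""
    return math.sqrt(true_rate * (1 - true_rate) / trials)

# A fair coin: how far the observed heads rate typically strays from 50%.
for flips in (10, 100, 1000):
    print(flips, round(rate_noise(0.5, flips), 3))
```

The typical miss shrinks with the square root of the sample, so you need about four times the data to cut the noise in half.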
Regression is NOT necessarily a bad thing. It simply means that the statistic will start moving back towards a normal number across a larger sample, which can be a positive or a negative thing for the player.
Pitching Stats – Traditional
Wins – There are few stats that draw more ire from people who look deeper into the numbers of baseball. There is an argument that the pitcher “did what it took” to get the win, regardless of whether he gave up 1,000 runs, as long as his team scored 1,001. The obvious flaw is that the pitcher does not control the offense and is rewarded or penalized more on team performance than on his own.
Save – Basically the closer’s version of the win, with similar flaws. It generally is not a reflection of the pitcher’s performance.
ERA – Earned Run Average. This stat shows how many earned runs the pitcher allows per 9 innings pitched. A run is “earned” when it scores without the help of an error, so the pitcher is held responsible for it. ERA does not account in any way for the defense behind the pitcher other than errors.
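For anyone who likes to see the arithmetic, ERA can be sketched in a couple of lines. The 9 earned runs and 37 innings below are illustrative inputs, not an official stat line. The function expects true innings, so a box score "37.1" (37 and a third) would need to be passed as 37 + 1/3.

```python
def era(earned_runs: int, innings_pitched: float) -> float:
    """Earned runs allowed per nine innings pitched."""
    return 9 * earned_runs / innings_pitched

# Illustrative inputs: 9 earned runs over 37 innings.
print(round(era(9, 37), 2))  # 2.19
```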
ERA+ - This stat takes ERA, adjusts for league and ballpark differences, and puts all pitchers on a common scale where 100 is average and higher is better.
K/9 & BB/9 – Straightforward stats: how many strikeouts or walks the pitcher had per 9 innings pitched. These are better than a raw K or BB count because they factor in the number of innings, but they don’t account for how many batters are getting on base.
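Both are the same calculation with a different count plugged in. A minimal sketch, where the 34 strikeouts and 24 walks over 37 innings are counts I back-calculated to land near the rates in the Valdez line quoted later in the post, so treat them as assumptions rather than official totals:

```python
def per_nine(count: int, innings_pitched: float) -> float:
    """Scale any counting stat to a per-nine-innings rate."""
    return 9 * count / innings_pitched

# Assumed counts over 37 innings (not official totals).
print(round(per_nine(34, 37), 2))  # K/9  -> 8.27
print(round(per_nine(24, 37), 2))  # BB/9 -> 5.84
```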
WHIP – Walks & Hits per Inning Pitched – This is a simple rate stat: the number of baserunners the pitcher allowed via walks or hits, divided by the number of innings pitched. It does not take into account anything regarding the defense behind a pitcher.
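As a quick sketch of the WHIP arithmetic, using assumed counts (24 walks and 22 hits over 37 innings, chosen for illustration rather than taken from an official box score):

```python
def whip(walks: int, hits: int, innings_pitched: float) -> float:
    """Walks plus hits allowed per inning pitched."""
    return (walks + hits) / innings_pitched

# Assumed counts, for illustration only.
print(round(whip(24, 22, 37), 3))  # 1.243
```

Note that hit batters and runners who reach on errors do not show up here, which is part of why WHIP is only a rough picture of traffic on the bases.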
Pitching Stats – “Advanced”
FIP – Fielding Independent Pitching – This stat looks only at the 3 true outcomes (strikeouts, walks and home runs), removing fielding from the equation to focus on the pitcher’s direct performance. It is scaled to read like an ERA.
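Here is a rough sketch of the commonly published FIP formula. Hit batters are lumped in with walks. The additive constant is recalculated every season so that league FIP matches league ERA. The 3.10 default below is only a ballpark assumption, and the sample counts are invented.

```python
def fip(home_runs: int, walks: int, hit_batters: int, strikeouts: int,
        innings_pitched: float, constant: float = 3.10) -> float:
    """Fielding Independent Pitching; `constant` is an assumed league value."""
    raw = 13 * home_runs + 3 * (walks + hit_batters) - 2 * strikeouts
    return raw / innings_pitched + constant

# Invented season: 20 HR, 50 BB, 5 HBP, 180 K in 190 innings.
print(round(fip(20, 50, 5, 180, 190), 2))  # 3.44
```

The weights show what the stat cares about: a home run hurts far more than a walk, and each strikeout buys some of that back.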
xFIP – Expected Fielding Independent Pitching – The primary difference between xFIP and FIP is that xFIP regresses the home run rate on fly balls to the league-average level. This was done primarily because the ability to suppress home runs on fly balls was found not to be a repeatable skill for pitchers and varied greatly from year to year.
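In code terms, xFIP is the FIP formula with the pitcher’s actual home runs swapped for the number an average pitcher would allow on the same fly balls. A minimal sketch, where the 10% league HR/FB rate and the 3.10 constant are assumptions (both move from season to season) and the counts are invented:

```python
def xfip(fly_balls: int, walks: int, hit_batters: int, strikeouts: int,
         innings_pitched: float, league_hr_per_fb: float = 0.10,
         constant: float = 3.10) -> float:
    """FIP with home runs replaced by fly balls times the league HR/FB rate."""
    expected_home_runs = fly_balls * league_hr_per_fb
    raw = 13 * expected_home_runs + 3 * (walks + hit_batters) - 2 * strikeouts
    return raw / innings_pitched + constant

# Same invented pitcher in two seasons: identical fly balls, walks and strikeouts.
# Whether he actually allowed 12 or 28 home runs, xFIP gives the same answer.
print(round(xfip(200, 50, 5, 180, 190), 2))
```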
SIERA – Skill-Interactive Earned Run Average – SIERA is similar to FIP in its focus on aspects within the pitcher’s control. The primary difference is that it takes into account what type of ball in play occurs (ground balls, line drives, fly balls, infield pop ups, etc.).
BABIP – Batting Average on Balls In Play – This is a commonly cited stat nowadays and is a first look at how “lucky” a pitcher may have been. It looks at what percentage of balls hit into the field of play went for hits. Home runs are not included in this. It is an excellent quick check to see if the results are sustainable, but it is flawed in that it does not dig deeper into the type of contact.
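The usual BABIP formula takes home runs and strikeouts out of both the hits and the chances, and adds sacrifice flies back in as balls in play. A small sketch with invented counts:

```python
def babip(hits: int, home_runs: int, at_bats: int,
          strikeouts: int, sac_flies: int) -> float:
    """Share of balls put in play (home runs excluded) that fell for hits."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    return (hits - home_runs) / balls_in_play

# Invented line: 150 H, 20 HR, 600 AB, 120 K, 5 SF -> 130 hits on 465 balls in play.
print(round(babip(150, 20, 600, 120, 5), 3))
```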
LOB% - Left On Base Percentage – Another stat attempting to quantify “luck” and repeatability. Generally, it shows whether an abnormally high or abnormally low number of players who reached base came around to score. League average is roughly 72%. High-strikeout pitchers have shown a greater ability to control this statistic, but it is again used as a gut check to see if the results were sustainable.
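LOB% is not counted from box score “left on base” totals. The version published on Fangraphs estimates it from hits, walks, hit batters, runs and home runs. A sketch of that formula with invented counts:

```python
def lob_pct(hits: int, walks: int, hit_batters: int,
            runs: int, home_runs: int) -> float:
    """Estimated share of baserunners stranded, per the Fangraphs-style formula."""
    baserunners = hits + walks + hit_batters
    return (baserunners - runs) / (baserunners - 1.4 * home_runs)

# Invented line: 180 H, 50 BB, 5 HBP, 80 R, 20 HR.
print(f"{lob_pct(180, 50, 5, 80, 20):.1%}")  # 74.9%
```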
wOBA – Weighted On Base Average – I want to explain this to get to xwOBA. The goal of wOBA is to create a value per plate appearance based on how the player got on base, assigning each way of reaching a value based on how much it contributes to scoring runs (a walk being worth less than a single, a single less than a double, etc.).
xwOBA – Expected Weighted On Base Average - This is a newer stat and is in some ways still in the early stages. xwOBA takes wOBA to the next level. Using Statcast, it assigns hit probabilities to each batted ball based on launch angle, exit velocity, etc., and combines them with the wOBA values to give you an overview of the value per plate appearance based on how the balls were hit (how hard, at what angle, etc.). xwOBA is useful for pitchers, but it reads on a scale similar to a triple-slash number rather than on an ERA scale.
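To make the “each way of reaching base has its own value” idea concrete, here is a minimal sketch. The weights are illustrative numbers in the neighborhood of published ones, not the values for any specific season (they are recalculated every year), and intentional walks are assumed to be already excluded from the walk count.

```python
# Illustrative run-value weights (assumed, not tied to a specific season).
WEIGHTS = {"walk": 0.69, "hbp": 0.72, "single": 0.89,
           "double": 1.27, "triple": 1.62, "home_run": 2.10}

def woba(walks: int, hbp: int, singles: int, doubles: int, triples: int,
         home_runs: int, at_bats: int, sac_flies: int) -> float:
    """Weighted value per plate appearance; `walks` means unintentional walks."""
    value = (WEIGHTS["walk"] * walks + WEIGHTS["hbp"] * hbp
             + WEIGHTS["single"] * singles + WEIGHTS["double"] * doubles
             + WEIGHTS["triple"] * triples + WEIGHTS["home_run"] * home_runs)
    return value / (at_bats + walks + sac_flies + hbp)

# Same number of times on base, very different value:
print(round(woba(50, 0, 0, 0, 0, 0, 450, 0), 3))   # 50 walks
print(round(woba(0, 0, 0, 50, 0, 0, 500, 0), 3))   # 50 doubles
```

For a pitcher, the same calculation is run on what opposing hitters did against him, so lower is better.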
K% - Another pretty straightforward statistic: the percentage of batters faced that the pitcher struck out. It has a much stronger correlation to how well the pitcher pitched than K/9. Because the denominator is batters faced rather than innings, it effectively penalizes pitchers who allow a high WHIP or a high BABIP.
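A small sketch of why K% and K/9 can disagree. The two pitchers below are invented: both strike out 9 batters in 9 innings, but one needs far more batters to get there because he keeps letting people on base.

```python
def k_per_nine(strikeouts: int, innings_pitched: float) -> float:
    """Strikeouts per nine innings."""
    return 9 * strikeouts / innings_pitched

def k_pct(strikeouts: int, batters_faced: int) -> float:
    """Share of batters faced who struck out."""
    return strikeouts / batters_faced

# Invented pitchers: same strikeouts and innings, different traffic.
clean = {"k": 9, "ip": 9, "bf": 36}
messy = {"k": 9, "ip": 9, "bf": 45}

for name, p in (("clean", clean), ("messy", messy)):
    print(name, k_per_nine(p["k"], p["ip"]), f"{k_pct(p['k'], p['bf']):.0%}")
```

K/9 calls them identical at 9.0, while K% separates them at 25% and 20%.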
WAR – Wins Above Replacement – A single stat that attempts to capture the total value the player provided in comparison to a “replacement level” player. It is calculated differently by different sites as each works to provide the most accurate representation possible.
So you want to evaluate a pitcher?
The list above does not cover every important stat, but it gives you enough for an initial look at a pitcher.
Say you find a pitcher – let’s use Framber Valdez.
4-1, 2.19 ERA, 37 IP, 8.27 K/9, 5.84 BB/9, 185 ERA+, 1.243 WHIP
Looking at this stat line in the traditional sense, you’d think Framber was an ace with erratic control but strong stuff. His WHIP should raise red flags. Hopefully you’d recognize that 37 IP is a small sample size and that the results are still very volatile.
4.65 FIP, 4.26 xFIP, 4.50 SIERA, .213 BABIP, 87.3% LOB, .282 wOBA, .275 xwOBA, 22.1% K% / 15.6% BB%, 0.1 fWAR
Okay, I know I just threw a lot of stats out there. Initially what I see is a HUGE difference between his ERA and his FIP, xFIP and SIERA, meaning advanced stats don’t believe the results are sustainable based on how he was pitching. He had a very low BABIP and a high LOB%, both of which raise red flags of potentially being “lucky”. The wOBA vs xwOBA seems in line.
Looking at him as a pitcher, you have to think that if he pitched exactly how he did last year, he would not come close to achieving those results on a consistent basis. The only caveat I have here is that 37 IP is still a very small number to try to judge a pitcher and gain a baseline on. Interestingly enough, his 2018 minor league results across 103 IP show almost the polar opposite: an ERA over 4 but a FIP at almost exactly 3.
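The gut check above can be sketched as a tiny script. The stat line is the one quoted in this post and the 72% league LOB% comes from the LOB% entry. The .300 league BABIP is a commonly cited ballpark figure that I am assuming, and the thresholds for what counts as a “big” gap are arbitrary choices of mine.

```python
valdez = {"ERA": 2.19, "FIP": 4.65, "xFIP": 4.26, "SIERA": 4.50,
          "BABIP": 0.213, "LOB%": 87.3}

LEAGUE_LOB_PCT = 72.0  # from the LOB% entry in this post
LEAGUE_BABIP = 0.300   # assumed ballpark league figure

def red_flags(line, era_gap=0.75, babip_gap=0.040, lob_gap=8.0):
    """List the ways a stat line looks out of step with its underlying numbers."""
    flags = []
    for estimator in ("FIP", "xFIP", "SIERA"):
        gap = line[estimator] - line["ERA"]
        if abs(gap) >= era_gap:
            side = "below" if gap > 0 else "above"
            flags.append(f"ERA is {abs(gap):.2f} runs {side} {estimator}")
    if abs(line["BABIP"] - LEAGUE_BABIP) >= babip_gap:
        flags.append(f"BABIP of {line['BABIP']:.3f} is far from league norm")
    if abs(line["LOB%"] - LEAGUE_LOB_PCT) >= lob_gap:
        flags.append(f"LOB% of {line['LOB%']:.1f} is far from league norm")
    return flags

for flag in red_flags(valdez):
    print(flag)
```

Every check trips for this line, and a flag cuts both ways: a pitcher whose ERA sits well above his FIP would be flagged as potentially unlucky.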
Tell me – how do you evaluate pitchers? What stats or important aspects did I miss? Do you have a clearer way to explain one of these stats? Please do!