Using Sabermetrics To Evaluate Astros Hitting Prospects

As an Astros fan who follows the team's prospects, one thing I struggle with is evaluating the organization's minor league players using sabermetric statistics.  There are many reasons this is harder for the minor leagues than the majors.  Foremost is that the data we get from minor league games is relatively limited and, in some cases, unreliable (for instance, batted ball data: What's a line drive and what's a fly ball?).

We can't do anything about data we don't have, but there are other important issues we can tackle with a little effort.  One is the age of the prospect in question.  When a 26-year-old bats .320 at Triple-A, it might draw some attention, but it isn't nearly as exciting as a 20-year-old doing the same thing.  You must adjust for age.  (Age tends to be much more important for hitters than for pitchers, which is why this article discusses only position players.)

Another issue is the drastic difference in run environment from one league to the next.  For instance, the California League is notoriously hitter-friendly, while the South Atlantic League is very pitcher-friendly.  These differences must be taken into account as well; it's not unusual to see a hitter "break out" in the California League, then fall back to his usual numbers at the next level.

Advanced sabermetric statistics are also hard to find for minor league players, so it takes some math to get beyond basic OPS (on-base plus slugging percentage).  Relying on OPS introduces significant noise into the assessment of a player's performance: while it's a lot better than traditional statistics like batting average and RBIs, it is still a crude statistic, for reasons I will explain below.

After the jump, you'll find a chart of the 2010 mid-season statistics for 24 Astros prospects, with my attempt to tackle the above problems.  The prospects are grouped by level, and are ranked by their performances adjusted for age, as compared to the average offensive performance of all hitters in their respective leagues.

Criteria for inclusion

To be included in the list below, a prospect must be younger than his league's average age.  He must also have at least 150 plate appearances in that league this season.  This is a crude way of eliminating "non-prospects" and players whose samples this season are too small to support any meaningful conclusions about their performances.
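The two criteria above can be sketched as a simple filter.  This is only an illustration: the `Prospect` class, its field names, and the `is_eligible` helper are hypothetical, and only the 150 plate appearance threshold and the "younger than league average" rule come from the text.

```python
from dataclasses import dataclass

# Minimum plate appearances in the league this season, as stated in the article.
MIN_PLATE_APPEARANCES = 150


@dataclass
class Prospect:
    # Hypothetical record layout, for illustration only.
    name: str
    age: float               # approximate age, e.g. as listed at Baseball-Reference
    plate_appearances: int   # plate appearances in this league this season


def is_eligible(prospect: Prospect, league_avg_age: float) -> bool:
    """Return True if the prospect meets both inclusion criteria."""
    younger_than_league = prospect.age < league_avg_age
    enough_sample = prospect.plate_appearances >= MIN_PLATE_APPEARANCES
    return younger_than_league and enough_sample
```

Note that a player exactly at the league's average age is excluded here, since the criterion says "younger than", while a player with exactly 150 plate appearances is included.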

Problem Areas

Note that there are some inaccuracies in the process used to generate this chart.  League average age and performance are from recent seasons, not this one.  Players' ages are inexact, and go by Baseball-Reference's listing for this season.  Please also note that math has never been my strong suit, so there's always the possibility of operator error.  Keep in mind that the adjustments below are rough; the idea is to improve upon simply looking at a prospect's OPS, not to perfect the process.

When looking at the chart below, remember that a prospect is expected to perform well above average; to make it to the big leagues, you must be much better than the average minor league player.  Also, the evaluation below doesn't take into account positional value.  An above-average hitting shortstop is more valuable than an above-average hitting left fielder.

Most of all, don't take this as a ranking of the value of the Astros' hitting prospects.  It's only a half-season sample, and it doesn't factor in the rest of their careers.  Furthermore, statistical analysis is not used by itself to evaluate a prospect in the minor leagues; scouting is generally considered just as important, if not more so.

Instead, view the chart below as a way to better judge which prospects have helped their stock this season with their production, and which have hurt it.

Gross Production Average (GPA)

While I would prefer to use a more advanced stat like Weighted On Base Average, the calculations are complex and include stats I don't have access to (without making even more calculations).  Instead, I opted to use Gross Production Average, which is almost identical to OPS except for two critical differences.  First, it weights on-base percentage much more accurately.  A point of OBP is worth approximately 80% more than a point of slugging percentage, but OPS counts a point of each equally.  Second, GPA is scaled to resemble batting average: .200 is the "Mendoza Line", .250 is approximately average, and .300 is excellent.
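For readers who want to see the arithmetic, here is a minimal sketch using the commonly cited GPA formula, (1.8 x OBP + SLG) / 4.  The 1.8 weight reflects the "80% more value" described above, and dividing by 4 brings the result down to a batting-average-like scale.  The function names and the two sample hitters are made up for illustration.

```python
def ops(obp: float, slg: float) -> float:
    """On-base plus slugging: a point of OBP and a point of SLG count equally."""
    return obp + slg


def gpa(obp: float, slg: float) -> float:
    """Gross Production Average: OBP weighted 1.8x, then scaled by 1/4."""
    return (1.8 * obp + slg) / 4


# Two hypothetical hitters with the same .800 OPS but different shapes.
on_base_hitter = {"obp": 0.400, "slg": 0.400}
slugger = {"obp": 0.320, "slg": 0.480}

for label, line in (("on-base hitter", on_base_hitter), ("slugger", slugger)):
    print(f"{label}: OPS {ops(**line):.3f}, GPA {gpa(**line):.3f}")
```

Both hitters show the same OPS, but GPA rates the on-base hitter higher (.280 against .264) because more of his value comes from OBP.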

If you need help understanding on-base percentage (OBP) and slugging percentage (SLG), Wikipedia is your friend.

Age Adjustment

To adjust GPA for a prospect's age, I increase his GPA by the percentage of the league's average age that his age gap represents, using his approximate age as listed at Baseball-Reference.  One year is 3.6 percent of the average age in the Pacific Coast League, so a prospect one year younger than that average would have his GPA multiplied by 1.036.

Remember that the listed prospect ages are inexact, and as such, so are the age adjustments.
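The age adjustment can be sketched as follows.  This is my reading of the method rather than a definitive implementation: I assume the multiplier scales linearly with the age gap (two years younger would be roughly 1.072), and the 27.8 average age is not a published figure but is back-derived from the statement that one year is 3.6 percent of the Pacific Coast League average.  The function and variable names are hypothetical.

```python
def age_adjustment_factor(age: float, league_avg_age: float) -> float:
    """Multiplier equal to 1 plus the age gap as a fraction of league average age.

    Assumes the adjustment grows linearly with the number of years the
    prospect is below the league's average age.
    """
    years_younger = league_avg_age - age
    return 1 + years_younger / league_avg_age


def age_adjusted_gpa(gpa: float, age: float, league_avg_age: float) -> float:
    """Scale a prospect's GPA by his age adjustment factor."""
    return gpa * age_adjustment_factor(age, league_avg_age)


# Assumed value: 1 / 27.8 is about 3.6 percent, matching the article's example.
ASSUMED_PCL_AVG_AGE = 27.8

factor = age_adjustment_factor(age=26.8, league_avg_age=ASSUMED_PCL_AVG_AGE)
adjusted = age_adjusted_gpa(gpa=0.250, age=26.8, league_avg_age=ASSUMED_PCL_AVG_AGE)
print(f"factor {factor:.3f}, adjusted GPA {adjusted:.3f}")
```

Because only players younger than their league's average age are included, the factor is always greater than 1 for the prospects considered here.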

Show Me The Chart, Already!


*AAGPA is a self-invented acronym for "Age-Adjusted Gross Production Average".  The number listed is how much above or below the league average a player's GPA is.  Higher is better.

**All individual player stats as of June 23, 2010, per Baseball-Reference.

***Thanks to The Hardball Times for the average offensive numbers for each league, and Scott Lucas for the average age of players in those leagues.