Astros Statistical Engine Update

One of the things that I've been doing over the last few days while first the Padres and then the A's were busy coming up empty with runners on base has been updating my copy of Ray Kerby's Astros Statistical Software with the numbers from the 2006 season.  

If you've never heard me wax poetic about how great the software is, by all means let me go ahead and do so now.  Originally, but no longer, available through the Astrosdaily web site, and written by that site's founder Ray Kerby, the software is quick, intuitively easy to understand, powerful, and is now, and always has been, free.

As such, as long as you're only interested in numbers for the Houston franchise, it blows the doors off the often-praised, but overrated Lee Sinins' Sabermetric Encyclopedia, which I have found to be both difficult to use, and possessed of a purchase price much greater than free.

Anybody who's come here more than once or twice is probably aware how much I love to crunch numbers to come up with silly, meaningless lists, and lemme tell you, there'd be a lot less number-crunching going on without this nifty software.

Unfortunately, the software and updates are no longer available through Astrosdaily, so for the past several years I've kept the software at my Astroland site, and have done the updates myself during the offseasons.

I've now got a link to both the software and the stats on the right sidebar under "Resources" and you should take a look.

Although I'll admit, sometimes the numbers can raise more questions than they answer.

Like check it out:  I was making sure everything was working right with the update, and ran a report on 2006 pitchers' batting average against:

 Yr    Name                 BAA 
2006 Chris Sampson         .205 
2006 Russ Springer         .211 
2006 Roger Clemens         .216 
2006 Dan Wheeler           .221 
2006 Trever Miller         .225 
2006 Brad Lidge            .238 
2006 Fernando Nieve        .242 
2006 Chad Qualls           .242 
2006 Taylor Buchholz       .248 
2006 Dave Borkowski        .257 
2006 Brandon Backe         .261 
2006 Roy Oswalt            .263 
2006 Jason Hirsh           .267 
2006 Andy Pettitte         .284 
2006 Wandy Rodriguez       .290 
2006 Ezequiel Astacio      .292 
2006 Matt Albers           .298 
2006 Mike Gallo            .400  

Maybe no surprises there, except what's that with Lidge and the .238 BAA?  Isn't that surprisingly good for someone with a 5.28 ERA?

And then I made another one of those scatter graphs, and it looks like it might just be:

[Scatter graph]

It DOES kinda look like Lidge is off kilter, there, huh?

Well, I thought, maybe this could be explained by the possibility that a higher percentage of Lidge's hits allowed were of the extra base variety. In other words, maybe he didn't give up more hits, but the hits he DID give up were more damaging, thus inducing the higher ERA.

Well, that also doesn't appear to be the case . . . Lidge's ratio of extra base hits to hits overall doesn't seem skewed in either direction.

Yr    Name                  H    XBH    XBH/H
2006 Chris Sampson          25    5     .200 
2006 Mike Gallo             28    7     .250 
2006 Matt Albers            17    5     .294 
2006 Chad Qualls            76   23     .303 
2006 Roger Clemens          89   28     .315 
2006 Roy Oswalt            220   77     .350 
2006 Andy Pettitte         238   84     .353 
2006 Brad Lidge             69   25     .362 
2006 Brandon Backe          43   16     .372 
       .             .                .
       .             .                .
2006 Ezequiel Astacio        7    3     .429 
2006 Dave Borkowski         70   31     .443 
2006 Fernando Nieve         87   39     .448 
2006 Dan Wheeler            58   26     .448 
2006 Taylor Buchholz       107   50     .467 

OK, maybe that's not the way to look at it, either. For example, almost half of the hits Dan Wheeler did allow were of the extra base variety, but overall, he did a good job of keeping people off base.
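For anyone who wants to check the arithmetic in the table above outside the stats engine, here's a minimal Python sketch. It is not the software's own code; the `xbh_share` and `fmt` helpers are hypothetical names made up for illustration, and the rows are just a few lines copied by hand from the table.

```python
# Hits (H) and extra-base hits (XBH) copied by hand from the table above.
ROWS = [
    ("Chris Sampson", 25, 5),
    ("Brad Lidge", 69, 25),
    ("Dan Wheeler", 58, 26),
    ("Taylor Buchholz", 107, 50),
]


def xbh_share(hits, xbh):
    """Fraction of hits allowed that went for extra bases (XBH divided by H)."""
    return xbh / hits


def fmt(rate):
    """Format a rate baseball-style, dropping the leading zero."""
    return f"{rate:.3f}".lstrip("0")


# Print the sample rows sorted from lowest to highest share of extra-base hits.
for name, hits, xbh in sorted(ROWS, key=lambda r: xbh_share(r[1], r[2])):
    print(f"{name:<20}{hits:>4}{xbh:>5}   {fmt(xbh_share(hits, xbh))}")
```

Wheeler's 26 extra-base hits out of 58 come out to .448, which is the "almost half" mentioned above.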

So maybe I should be looking at extra base hits per at bat:

  Yr Name                  H     XBH    XBH/AB 
2006 Chris Sampson          25    5      .041 
2006 Roger Clemens          89   28      .068 
2006 Chad Qualls            76   23      .073 
2006 Russ Springer          46   18      .083 
2006 Trever Miller          42   16      .086 
2006 Brad Lidge             69   25      .086 
2006 Matt Albers            17    5      .088 
2006 Roy Oswalt            220   77      .092 
      .          .             .
      .          .             .
2006 Dave Borkowski         70   31      .114 
2006 Taylor Buchholz       107   50      .116 
2006 Ezequiel Astacio        7    3      .125 
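The tables in this post don't list at bats, so here's a rough Python sketch of how the XBH/AB column can be double-checked from the numbers that are shown. The big assumption is that at bats can be backed out as hits divided by batting average against, rounded to a whole number; the function names are hypothetical, and this is not how the stats engine itself computes anything.

```python
def estimated_at_bats(hits, baa):
    """Estimate at bats from hits and batting average against.

    Assumption: AB is roughly H / BAA. Because BAA is rounded to three
    places, the estimate can be off by an at bat or so for big hit totals.
    """
    return round(hits / baa)


def xbh_per_ab(xbh, at_bats):
    """Extra-base hits allowed per at bat."""
    return xbh / at_bats


# (name, H, XBH, BAA) copied by hand from the tables in this post.
ROWS = [
    ("Chris Sampson", 25, 5, 0.205),
    ("Roger Clemens", 89, 28, 0.216),
    ("Brad Lidge", 69, 25, 0.238),
    ("Taylor Buchholz", 107, 50, 0.248),
]

for name, hits, xbh, baa in ROWS:
    ab = estimated_at_bats(hits, baa)
    print(f"{name:<20} est. AB {ab:>4}   XBH/AB {xbh_per_ab(xbh, ab):.3f}")
```

With these inputs, Lidge's estimated at bats come to 290, and 25 extra-base hits over 290 at bats rounds to the .086 shown in the table.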

So this makes it look as if Lidge is pretty good at limiting the extra base hit. But if he limits the extra base hit well, and has a good to decent batting average against, then how do you explain--other, of course, than with the evidence we saw with our eyes during the course of the season--the audaciously bad ERA?

Here's the staff ranked by OPS against, but I dunno, maybe (and I can't believe I'm seriously suggesting this) Lidge was somewhat the victim of plain bad luck.

Yr   Name                 OPS A
2006 Chris Sampson         .535 
2006 Roger Clemens         .596 
2006 Dan Wheeler           .649 
2006 Trever Miller         .671 
2006 Russ Springer         .672 
2006 Chad Qualls           .695 
2006 Roy Oswalt            .702 
2006 Brad Lidge            .736 
   .              .            .
   .              .            . 
2006 Jason Hirsh           .849 
2006 Ezequiel Astacio     1.017 
2006 Mike Gallo           1.048

Anyway, whatever the answer to this Lidgian conundrum, and even if you think this whole Lidge thing is a freaking wheelspin because he obviously walked too many batters, and plunked too many to boot, these are the kinds of things you can look at with the statistical engine, and what a tremendously fun toy it is.