We have had a few stories about Baseball Prospectus and both team and player projections. In commenting on BP's team projections, I mentioned my reservations based on the fact that BP's defensive measures are not state of the art. The best defensive measurement systems use play-by-play data, and Baseball Prospectus does not. My feeling is that this approach would underestimate a team like the Astros, which surprised analysts in part because the team defense was so good.
I saw an interesting article by New York writer Tim Marchman on this very subject. He quotes BP to the effect that they chose not to use play-by-play data because the data is not available for minor league players, and their analyses try to keep the data uniform between the minors and majors.
As I understand the idea here, BP wants to make apples-to-apples comparisons between their minor league and major league defensive numbers, and so is artificially crippling the data set they're using to derive the major league numbers to bring it into line with the less granular data available for the minors. I see the appeal, but it makes the topline numbers suspect, especially when the system arrives at seemingly wonky results like Bobby Abreu rating as a plus defender and Hanley Ramirez as a Gold Glove candidate last year. Of course even very good systems have outliers, but not every system intentionally deals with a reduced set of data. For now I'll continue to rely on UZR and Plus/Minus, though I'll be curious to see what people like Tom Tango have to say about the technical pros and cons of the new system.
I understand the idea that some comparisons can't be made because play-by-play data isn't available. Methods like Total Zone were developed in order to make defensive comparisons across baseball eras. For example, with this method we find out that Adam Everett was a truly exceptional shortstop, and that Mark Belanger is one of the few shortstops from the 1960s who can be compared to him. However, Total Zone is clearly inferior to play-by-play systems like UZR, PMR, and Fielding Bible, because it relies on inferences and proxies derived from Retrosheet summary data. I should note that Total Zone is also used to derive defensive data on minor league players. It would be interesting to know how those ratings compare to BP's defensive system.
I suppose this goes to show that you can't do everything well. In the case of Baseball Prospectus, the minor league data part of their projections is so important for what they want to do that it trumps the accuracy of their major league defensive data.