Given the recent spirited debates regarding Adam Dunn's defensive value to the Nats (see here and here), I thought it would be a good time to review Ultimate Zone Rating (UZR). What are its strengths and weaknesses, and how should we use it? I know these posts are hard to find after a week or two, so I've also posted this as a permanent page at Natstats here.
Ultimate Zone Rating
From baseball’s earliest days, managers, players, reporters and fans have all tried to answer a seemingly simple question: What value does a player contribute to his team defensively? For the first 100 or so years, we relied on nothing more than fielding percentage: (balls played - errors)/(balls played). We all understand that this isn’t a good measure of value. A 50-year-old man who plays third base, fields only the slowest of ground balls, and makes a lazy but accurate lollipop throw to first has a fielding percentage of 1.000. Ryan Zimmerman, who dives for balls in the camera well and makes throws from the left field tarp, will have a less than perfect fielding percentage. Still, every team in baseball would be happy to have Zimmerman at third.
Bill James tried to improve on fielding percentage when he created Range Factor: (put outs + assists)/(games played). This was a better stat than fielding percentage, but it still had room for improvement. In 2003 Mitchel Lichtman (or MGL) introduced a new stat at the Baseball Think Factory called Ultimate Zone Rating (UZR). MGL tried to account for external factors such as variance in pitching, variance between ballparks, and luck. He also converted from a games played stat to an innings played stat. In theory, UZR measures how well a player converts a batted ball into an out. This is a positional stat. You can’t compare the UZR of a right fielder to the UZR of a shortstop (the old apples and oranges thing).
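To make the two older formulas concrete, here is a minimal Python sketch of fielding percentage and Range Factor as they are defined above. The two season lines are invented purely for illustration (they are not real stats for anyone), and the function names are my own.

```python
def fielding_percentage(chances: int, errors: int) -> float:
    """(balls played - errors) / (balls played), as defined in the text."""
    if chances <= 0:
        raise ValueError("chances must be positive")
    return (chances - errors) / chances


def range_factor(putouts: int, assists: int, games: int) -> float:
    """(put outs + assists) / (games played), as defined in the text."""
    if games <= 0:
        raise ValueError("games must be positive")
    return (putouts + assists) / games


# Hypothetical season lines, made up for this example only.
slow_fielder = {"chances": 100, "errors": 0, "putouts": 30, "assists": 70, "games": 100}
rangy_fielder = {"chances": 450, "errors": 15, "putouts": 120, "assists": 315, "games": 150}

for name, line in (("slow", slow_fielder), ("rangy", rangy_fielder)):
    fpct = fielding_percentage(line["chances"], line["errors"])
    rf = range_factor(line["putouts"], line["assists"], line["games"])
    print(f"{name}: fielding pct {fpct:.3f}, range factor {rf:.2f}")
```

With these made-up numbers the slow fielder has the perfect fielding percentage, while Range Factor credits the rangy fielder for making nearly three times as many plays per game.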
How Is UZR Computed?
The field is segmented into 78 zones, 64 of which are used in UZR calculations. Every play is entered into a huge database with items such as the zone number where the ball landed, the type of hit (Ground Ball, Fly Ball, Line Drive, Pop Up), etc. Here’s a chart of the zones:
To adjust for ball park effects, outfield foul balls are ignored. Also, infield line drives, which are more the result of positioning than skill, are ignored, as are infield pop-flies. Pitchers and catchers are not included in UZR.
A Little Bit of Math
After every play is entered, the math begins. Algorithms are run for every zone to determine the number of balls hit, the type of hit, how often the ball was fielded for an out, how often each position made the play, and so on. For example, consider a zone between shortstop and third base. For simplicity’s sake, say 250 balls landed in the zone: 50 fell for hits, 150 were fielded by the third baseman for outs, and 50 were fielded by the shortstop for outs. The MLB average is computed for that zone and stored in an expectancy matrix, and each player is then compared against the matrix for his position. Our 50-year-old man, who records 10 outs in 250 chances in that zone, is compared against the third baseman’s expectancy of 150 outs and receives a -140 for that zone. Zimmerman, who might record 190 outs in that zone, gets a +40. These computations are made for each zone of responsibility on a positional basis (the apples and oranges thing again) to create each player’s UZR.
The UZR/150 you see on Fangraphs also makes adjustments for handedness of the pitcher and batter, the game state (the number of outs and the runner positions determine where a throw is made), double plays turned, batted ball speed, and errors made. The 150 in UZR/150 means that Fangraphs has normalized UZR calculations so that all players are compared over a 150-game season (more on this later).
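The single-zone example above can be sketched in a few lines of Python. This is only a toy version under stated assumptions: it covers one zone, counts plays rather than anything more refined, and leaves out every adjustment listed above (handedness, game state, batted ball speed, and so on). The dictionary layout and function names are my own and are not MGL's actual implementation.

```python
# League totals for one zone between shortstop and third base,
# using the numbers from the example in the text.
zone_totals = {
    "balls": 250,
    "hits": 50,
    "outs_by_position": {"3B": 150, "SS": 50},
}


def expected_outs(zone: dict, position: str, chances: int) -> float:
    """League-average outs for `position` on `chances` balls hit into this zone."""
    league_outs = zone["outs_by_position"].get(position, 0)
    return league_outs * chances / zone["balls"]


def plays_vs_average(zone: dict, position: str, chances: int, outs_made: int) -> float:
    """Outs the fielder actually made minus the league expectancy."""
    return outs_made - expected_outs(zone, position, chances)


# Both third basemen see the same 250 balls in this zone.
print(plays_vs_average(zone_totals, "3B", 250, 10))   # the 50-year-old man: -140.0
print(plays_vs_average(zone_totals, "3B", 250, 190))  # Zimmerman: 40.0
```

Summing this kind of difference over every zone a position is responsible for gives the flavor of how a player's total is built up, always against the expectancy for his own position.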
How Reliable is UZR?
How Many Games Do We Need?
Tom Tango, co-author of Inside the Book, believes that 200 Plate Appearances (PA) are equivalent to 400 Balls in Play (BIP). He has also found that different defensive positions receive a different number of chances in a game. His research shows that SS and 2B get on average 5 BIP per 9 innings, 3B and CF get 4 BIP per 9 innings, and LF, RF, and 1B get 3 BIP per 9 innings. Think about that. If Adam Dunn plays 150 games at 1B for the Nats this year, he will see only 450 BIP, or the equivalent of 225 PA. We would never judge a player’s offensive abilities on 225 PA. We shouldn’t judge a player’s defensive abilities on 450 BIP. In reality, our defensive sample size doesn’t reach critical mass until roughly 3 seasons of data have been entered.
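The arithmetic behind the Dunn example is easy to check. The sketch below uses only the figures quoted above (BIP per 9 innings by position, and 400 BIP counting the same as 200 PA). It assumes every game is a full 9 innings at the position, which is my simplification, and the names are my own.

```python
# Average balls in play per 9 innings by position, as quoted in the text.
BIP_PER_9 = {"SS": 5, "2B": 5, "3B": 4, "CF": 4, "LF": 3, "RF": 3, "1B": 3}

# 400 BIP are treated as equivalent to 200 PA, i.e. 2 BIP per PA.
BIP_PER_PA = 400 / 200


def season_bip(position: str, games: int) -> int:
    """Balls in play a fielder sees, assuming 9 innings at the position every game."""
    return BIP_PER_9[position] * games


def pa_equivalent(bip: float) -> float:
    """Convert a count of balls in play into the equivalent number of plate appearances."""
    return bip / BIP_PER_PA


for pos in ("1B", "3B", "SS"):
    bip = season_bip(pos, 150)
    print(f"{pos}: {bip} BIP over 150 games, about {pa_equivalent(bip):.0f} PA")
```

For a first baseman this reproduces the 450 BIP and 225 PA from the paragraph above. Even a shortstop, who sees the most chances, only gets to the equivalent of 375 PA in a 150-game season.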
As noted above, Fangraphs normalizes UZR data to a 150-game season. Now that we know we need 3 full seasons (or 450 games) of defensive stats for a reliable sample, we can see how unreliable this stat can be. If a player has played a half season (75 games) at a position, that is only one-sixth of the data we need for reliable analysis. Extrapolating these 75 games to 150 is no different from extrapolating 10 coin flips to a million. We can create a fancy formula to come up with a number, but there isn’t enough data to make the number meaningful.
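Here is a small Python sketch of that point. The 450-game threshold comes from the paragraph above. The scaling function is a plain proportional extrapolation that I am assuming for illustration, not necessarily the exact method Fangraphs uses, and the -4.0 input is a made-up number.

```python
from fractions import Fraction

# Three 150-game seasons, the sample size the text treats as reliable.
RELIABLE_GAMES = 3 * 150


def sample_fraction(games_played: int) -> Fraction:
    """Share of the reliable sample that a player's games represent."""
    return Fraction(games_played, RELIABLE_GAMES)


def naive_per_150(raw_value: float, games_played: int) -> float:
    """Proportionally scale a raw total to 150 games (an assumed, simplified scaling)."""
    return raw_value * 150 / games_played


print(sample_fraction(75))       # 1/6
print(naive_per_150(-4.0, 75))   # -8.0
```

Notice that the scaling doubles whatever happened in those 75 games, noise included. A half season that is only one-sixth of a reliable sample ends up looking like a full-season verdict.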
What To Do?
Much like we use the slash stats (AVG/OBP/SLG) in tandem to get a more complete look at a player’s offensive abilities, there are multiple stats that try to measure a player’s defensive ability. In addition to fielding percentage, zone rating and UZR, John Dewan devised a stat called Plus/Minus. (For more information, go here.) Plus/Minus breaks the field into zones and is very similar to UZR. One of its biggest improvements applies to the 1B position. UZR does not account for a 1B holding a runner, so first basemen on teams whose pitchers allow more base runners are susceptible to a lower UZR. Plus/Minus corrects that omission by adding "runner on 1st" as one of the game state adjustments. Of course, we still need 3 years of stats for Plus/Minus to achieve the desired level of confidence.
The bottom line is this: none of these stats paints a perfect picture of a player’s true defensive value. UZR and Plus/Minus are better than fielding percentage. Maybe we should start a new defensive slash stat called Fielding Percentage/UZR/Plus-Minus?