Tango on Baseball Archives

Diamond Mind Baseball - Gold Glove Winners (December 11, 2003)

Tippett goes through his process...
--posted by TangoTiger at 05:03 PM EDT

Posted 7:07 p.m., December 11, 2003 (#1) - taco tuesday
Jose Valentin. who would've ever believed? : )

Posted 7:21 p.m., December 11, 2003 (#2) - studes (homepage)
Same with Win Shares, by the way. Valentin was the top-rated shortstop in the AL.

Posted 12:33 a.m., December 12, 2003 (#3) - MGL
I have some respect for DM, but..

1) Their methods are apparently proprietary, which immediately takes them out of the realm of legitimate scientific research as far as I am concerned.

2) The description of their analyses on the above website is tainted with surplusage and fluff. To wit:

we look at range factors, which are assists and/or putouts per nine defensive innings, keeping in mind that range factors can be severely biased by the nature of a team's pitching staff: the left/right mix, strikeout rates, and tendency to generate ground balls versus fly balls

There is absolutely no reason to "look at range factors" if you have access to PBP data and use it for a zone-based defensive metric, which they obviously do. That's like saying that "even though we have sophisticated metrics for evaluating offensive value and production, like lwts, OPS, runs created, baseruns, etc., we also look at BA."

That is pure unadulterated B.S.!
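For reference, the range factor in the quoted passage is simple arithmetic. A minimal sketch, assuming both putouts and assists are counted (the quote says "assists and/or putouts", so the mix may differ by position); the function name and the sample totals are made up for illustration:

```python
def range_factor(putouts, assists, defensive_innings):
    """Plays made (putouts + assists) per nine defensive innings."""
    if defensive_innings <= 0:
        raise ValueError("defensive_innings must be positive")
    return 9.0 * (putouts + assists) / defensive_innings


# Hypothetical season line, not taken from any real player.
rf = range_factor(putouts=250, assists=450, defensive_innings=1350.0)
print(f"range factor: {rf:.2f}")
```

Nothing in this calculation knows about the pitching staff, which is the bias the quoted passage itself warns about.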

in cases where our findings are at odds with a player's reputation, we use the video clips on MLB.com to watch a large number of plays involving that fielder...

They can't be serious!

I take back what I said in the first sentence. I have absolutely no respect for them....

Posted 12:45 a.m., December 12, 2003 (#4) - MGL
To continue my excoriation of DM....

But it's hard to judge pitchers on only one season because they typically get dozens of chances to make plays, while other fielders get hundreds of opportunities.

If we extend our review of pitchers who convert a high percentage of chances into outs to include the last three years...

Unbelievably asinine statements! The gold gloves (both the official ones and the DM ones) are awarded each year. What possible relevance is a player's 3-year stats? When giving yearly awards, you don't care if a player's one-year performance was luck or skill. Sheesh!

Mussina was a good pick, in my view, because he was in the league's top tier in turning batted balls into outs, was third in the league in error-free chances, controlled the running game (only 9 steals in 19 attempts), and has done these things well enough in the past to show that this was not a fluke....

I didn't know that controlling the running game was part of a pitcher's qualifications for a gold glove, which is "fielding performance" as far as I know. Do they consider a position player's SB/CS totals for gold gloves? (OK, maybe that's a stretch.)

And again, whether a fielder's performance was a "fluke" or not should have absolutely no bearing on his Gold Glove qualifications in any given year!

I am blown away by the poor quality of this article (as you can tell)...

Posted 1:02 a.m., December 12, 2003 (#5) - MGL
Two more criticisms, now that I've plunged Primer into italics hell....

"It's a classic question. Would you rather have a guy with great range but is somewhat error-prone or someone who's steadier but doesn't cover as much ground?"

That's a classic question? DM, did you forget that you were a sabermetric web site?

Since an error is "worth" almost exactly what a missed ball (base hit) is, you would rather have the guy that makes the most outs (per opportunity), period! That's not much of a question (for a sabermetrically knowledgeable person)!

Finally, DM just gets done trashing A Jones' defense (and rightfully so, although I haven't done his 2003 UZR yet, but I expect it to be bad like 2002); then they go ahead and make him one of their Gold Glove recipients! Makes no sense to me!

Posted 2:15 a.m., December 12, 2003 (#6) - MGL

Posted 2:18 a.m., December 12, 2003 (#7) - MGL
Trying to get rid of italics. Not having much luck.

Posted 2:25 a.m., December 12, 2003 (#8) - AED
supposedly this works...

Posted 3:33 a.m., December 12, 2003 (#9) - MGL
I take back what I said about Jones. A preliminary 2003 UZR for A. Jones is:

+16 runs per 162 games (not including arm).

In the NL (CF only), BTW, Biggio was terrible, Pierre was only average, Lofton was still good, Finley was atrocious, and Kotsay was God (again)...

In the AL (CF only)...

Bernie once again was off the chart terrible (not even counting his bad arm), at -44 per 162! He has to be moved to DH! Baldelli was terrible, Hunter was good, but not great once again, and Cameron was lights out again (one of the best overall players in baseball), with Wells, Beltran, and Damon all good as expected. Oh and Erstad, for his limited playing time, was once again great, even with all his injuries...

I'll publish the complete 2003 UZR's soon...

Posted 9:19 a.m., December 12, 2003 (#10) - Tangotiger
I am disappointed that someone as intelligent and knowledgeable as MGL makes an evaluation without providing some balance. It's not like DMB is trash, where finding balance would be hard. DMB may have the best sabermetric minds and system, bar none. I too would hold them to the highest standards, to the level of James and Palmer.

I found their presentation balanced, as they attempted to bring in the old minds, with talks of Range Factor and other non-PBP methods, and introducing their PBP methods as well.

We have to remember that when you talk about RF with lefty/righty splits, park and a few other 100% reliable data, you are trying to sell someone that thinks the PBP data, like grid location, and batted ball type/speed, is questionable.

Since no one has publicly shown the reliability of such PBP data (I want a correlation by stringers!), it's certainly a fair thing for DMB to do. Essentially, they could say that "hey! without considering grid/ball type/speed, Mike Cameron is great! And look, if we consider all the possibly questionable PBP data, Mike Cameron is great! So? Mike Cameron is great!"

***

As for SB/CS, if they are considered for catchers, they should most certainly be considered for pitchers. And heck, why not for the 2b/ss (though of course that would be 1% of what they'd do)?

***

As for multi-year data for pitchers, I think that's also fair. With the extremely limited data points, you can essentially pick 15 different pitchers, and you can make an argument for any of them that would be statistically significant. Because DMB does not track each pitcher's actual positioning, and the ball off the bat, and all that, they then decide to cut their losses, and look at prior years. I mean Greg Maddux and Mitch Williams may end up looking the same one year, by the stats by luck (ok, that's a stretch), but just looking at them for 10 plays would be enough to say that Greg would be a much better fielder.

***

Pinto had Andruw Jones near the top in CF rankings, as did DMB. I'm not surprised that UZR also has him near the top. This is more evidence of the silly talk about figuring out that someone is in a decline phase because you see his +20 to +0 ratings over a 4-year period. Now that he's +16, what are people going to say?

Posted 9:24 a.m., December 12, 2003 (#11) - Michael Humphreys
MGL,

Looking forward to the UZR ratings, which are the single most valuable piece of sabermetric work every year. The Andruw rating is close to Pinto's. Diamond Mind appraisals are worth reading (despite the Range Factor stuff), but one really wishes they would just provide a number. Or even a range of numbers based on the various criteria used. Do they even say why they won't provide numbers?

Posted 9:33 a.m., December 12, 2003 (#12) - dlf
MGL,

1. If you plunge into never ending italics or bold or another odd typeface, in the next thread, open the italics and then close it TWICE. (i) something (/i) (/i) obviously replacing the parentheses with the less than and greater than signs.

2. You seem to suggest that PBP should completely replace any DRA, DFT, Def Win Shares analysis. I disagree. Unless and until we get to the point where all potential biases (park, pitching, opponents, defensive positioning, etc.) are removed, I believe they should complement one another. I understand that you attempt to adjust for most such biases in UZR, but until we KNOW that all are absolutely and correctly adjusted we should look at a multitude of data, PBP and non-PBP. That you continue to tinker with the method leads me to believe that we haven't gotten to that point yet.

3. Why shouldn't a pitcher's ability to control the running game be considered part of his fielding? It is for a catcher. I would hazard to guess that because of the number of potential runners versus the number of BIP in his zone of responsibility, of the non-pitching part of his defense, it could easily have the largest effect. That, in part, is why I've always thought the annual choice of Greg Maddux as Gold Glove recipient was incorrect. Maddux, for all his greatness, couldn't keep Mo Vaughn, while carrying David Wells and Calvin Pickering on his back, from stealing second.

4. You state "When giving yearly awards, you don't care if a player's one-year performance was luck or skill." I agree. However, I would suggest that using UZR to determine Gold Gloves could fall into that same trap. It is my understanding that many of the adjustments (park factors particularly) are based on multi-year data. And to determine whose talent is better, that is clearly the correct choice. But if trying to decide who was better retrospectively, I don't think it relevant if a park is playing abnormally in a particular year. (Unrelated to this point, but also a problem of using UZR to select GG is your practice of placing all players into the same context regarding number of opportunities. If the norm is 80%, Joe makes 9 of 10 plays, and Bob makes 17 of 20, Bob has helped his team more and in my opinion is more deserving of the GG regardless of a reasonable expectation that next year Joe will be better. That, obviously, is a question regarding the manner you present data rather than the underlying data itself.)

5. You seem to mock the idea of actual observation of performance. I disagree. I think that it is silly to say, as do many fans and sports writers, 'I saw Fred make great plays, therefore he must be a great fielder.' However, that is different from using systematic observation of many (all?) players for many (all?) plays. It is my understanding that PBP data puts all BIP into three categories, grounder, line drive and fly ball. It is my understanding that PBP data doesn't make indication for defensive positioning. If a fielder's reputation is out of line with PBP data, it is most likely that the reputation is wrong; however, that does not eliminate the possibility that the team has an odd positioning, an unusual number of borderline flies/liners or something else not picked up in the raw data. In analyzing the decision to take Pedro Martinez out of ALCS game 7, you closely reviewed (video? digital?) tapes of actual pitches to observe speed and location (and movement?). That is absolutely appropriate. I would suggest similar observation is relevant and appropriate to determining fielding prowess.
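The Joe and Bob example in point 4 can be worked through directly. A rough sketch using only the numbers in the post (norm of 80%, Joe 9 of 10, Bob 17 of 20); the function names are illustrative, and the "scaled" version is just one reading of "placing all players into the same context regarding number of opportunities", not a description of how UZR is actually computed:

```python
def plays_above_average(made, chances, league_rate):
    """Raw plays made beyond what a league-average fielder would make."""
    return made - league_rate * chances


def rate_scaled(made, chances, league_rate, common_chances):
    """Same comparison after putting everyone on an equal number of chances."""
    return (made / chances - league_rate) * common_chances


LEAGUE_RATE = 0.80  # "the norm is 80%" from the post

joe_raw = plays_above_average(9, 10, LEAGUE_RATE)
bob_raw = plays_above_average(17, 20, LEAGUE_RATE)
# Assumption: both players are scaled to 20 chances for the comparison.
joe_scaled = rate_scaled(9, 10, LEAGUE_RATE, common_chances=20)
bob_scaled = rate_scaled(17, 20, LEAGUE_RATE, common_chances=20)

print(f"raw:    Joe {joe_raw:+.1f}, Bob {bob_raw:+.1f}")
print(f"scaled: Joe {joe_scaled:+.1f}, Bob {bob_scaled:+.1f}")
```

With these particular numbers the raw plays above average come out equal (9 - 8 and 17 - 16), while Bob made more plays in total and Joe looks better once both are scaled to the same number of chances, so the choice of presentation changes the ranking.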

Posted 9:36 a.m., December 12, 2003 (#13) - dlf
Jeeze, half of what I just wrote, Tango posted a few minutes earlier and much more concisely. I need to learn to either type faster or hit refresh before I submit a comment.

Posted 10:02 a.m., December 12, 2003 (#14) - MGL
Hey, at least I stimulated some good discussion! Seriously, I was unfair to DM of course. Despite their shortcomings, to which none of us is immune, they are one of the best sabermetric resources out there, especially for defense. Of course, we all wish that they would provide more insight into their exact methodologies, as well as provide a "number" for defense, but you can't fault someone for trying to make a buck, and I guess that is one of their market strategies.

I'll have some more comments later. Got to go to a funeral (ex-mother-in-law passed away)....

Posted 11:19 a.m., December 12, 2003 (#15) - Tangotiger
dlf: I very much enjoyed reading your comments, so please don't think that they could have been supplanted by my comments. I thought they were very well-written.

Posted 11:21 a.m., December 12, 2003 (#16) - Anonymous
.

Posted 11:43 a.m., December 12, 2003 (#17) - J Cross
btw, when asked, Billy Beane said that the best defenders in baseball were Chavez, Cameron and Mientkiewicz. Seems like DM backs that up. No surprise, to me, to see Beane going after Cameron. Is Mientkiewicz's fielding enough to make him a valuable player? Do any metrics take into account scooping bad throws for first basemen? Any idea how important that is?

MGL, looking forward to those 2003 UZR's.

Is it surprising that Cameron is still great in CF as he gets older? Any chances fielding goes first and can be used as an added predictor for offense? I've heard that some new research suggests that the sense of smell is the first thing to go as people dement and that they might be able to tell who is heading in that direction by testing sense of smell. Do you think they'll ever find something like this for baseball?

Posted 11:48 a.m., December 12, 2003 (#18) - J Cross
I mean, just as a for instance, Edgar Martinez's defense fell off and then.... nevermind.

Posted 12:40 p.m., December 12, 2003 (#19) - ColinM
"When giving yearly awards, you don't care if a player's one-year performance was luck or skill."

I want to comment on this statement, which dlf also touched on. While I agree with this in principle, I don't think it can work in practice, at least when it comes to handing out awards for fielding. It's not that I want to give credit to a player for performance in other years, it's just that I don't think the existing fielding metrics provide enough of a confidence level to accurately judge what a player's performance actually was in a single season.

Look at batting performance. You have a solid baseline to compare to, the league average. Whether you want to adjust this to a replacement level is a matter of taste, but at least you have something tangible to compare to, a level where you can be reasonably confident the "average" player would perform at. You want to adjust for park factors of course, but you can still be pretty sure it's close to the right value.

Fielding is different. Even the best methods like UZR (which I'm a bit in awe of) make a ton of adjustments and assumptions in order to estimate an "average" baseline. The confidence level just isn't there that the average it comes up with is "right", at least compared to batting or pitching. So it only makes sense to look at multi-year data in order to add extra information, to increase your confidence that the average baseline you're using is correct.

Posted 2:19 p.m., December 12, 2003 (#20) - dlf
Tango,

Thanks!

MGL,

My sympathies for your loss. While I anxiously look forward to your response to the issues raised by Tango, J.Cross, ColinM and myself, I can appreciate a slight change in priorities at the time of a family member's death.

Posted 3:28 p.m., December 12, 2003 (#21) - J Cross
MGL, I'm sorry to hear that. I didn't read to the end of your post before and didn't mean to just ignore it.

Posted 7:47 p.m., December 12, 2003 (#22) - MGL
Thanks guys.

Anyway...

A few more comments...

Since UZR's at first base do not have a large range of values, it is reasonable to think that scooping bad throws is a fairly big part of defensive ability at first. Snow is considered particularly adept at this (hard to say whether it is true of course). His UZR ratings have not been that good (a little above average this year), but he may be a much better first baseman overall because of his scooping ability.

It pisses me off that the "stringers," the people who score the games for STATS and other companies that provide PBP data, don't record "bad throws". It's not the stringers' fault of course. There is a lot of valuable stuff that is left out of the PBP data, but that is starting to change. Eventually, everything will be taken off a video by a computer I imagine.

I am working on a way to estimate a first baseman's scooping ability by simply comparing the errors made by 2b, SS, and 3b with that player at first and the same 2b, SS, and 3b with another player at first, with the appropriate weightings. Actually, it is not as simple as it sounds, as you have to deal with all kinds of tricky sample size issues.
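The comparison described above might look something like the following. This is only a sketch under stated assumptions: the post does not give the actual weightings, so weighting each infielder by the smaller of his two chance totals is a guess, and the function name, the input layout, and the sample numbers are all hypothetical. It also ignores the sample size problems mentioned above.

```python
def scoop_estimate(pairs):
    """Estimate errors saved per chance by a first baseman.

    pairs: one tuple per infielder (2b, SS, or 3b), holding
        (errors_with, chances_with, errors_without, chances_without)
    where "with" means the first baseman in question was playing first.
    """
    total_weight = 0.0
    weighted_diff = 0.0
    for err_with, ch_with, err_without, ch_without in pairs:
        if ch_with == 0 or ch_without == 0:
            continue  # no basis for comparison for this infielder
        # Assumed weighting: the smaller sample limits how much we trust the pair.
        weight = min(ch_with, ch_without)
        diff = err_without / ch_without - err_with / ch_with
        weighted_diff += weight * diff
        total_weight += weight
    return weighted_diff / total_weight if total_weight else 0.0


# Hypothetical infielders: one makes fewer errors with this first baseman, one is unchanged.
sample = [(4, 200, 6, 200), (2, 100, 2, 100)]
saved_per_chance = scoop_estimate(sample)
print(f"errors saved per chance: {saved_per_chance:.4f}")
```

A positive value means the infielders were charged with fewer errors per chance when throwing to this first baseman than when throwing to someone else.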

As far as whether one should use prior years' data for these types of awards, I completely disagree with the sentiment here. Although I normally have little interest in performance type metrics (I usually want to know true talent and future performance), I think that an award such as a gold glove should have absolutely nothing to do with any other year but the one in question, nor should you have to use data from any other year. I understand the arguments made herein about using past years to establish baselines and to increase the confidence level of one particular year's evaluation, but the bottom line is that other years should have very little (almost none) relevance or influence on how much "value" a player added or subtracted to his team as compared to another player in the same year, nor should we care at all about that player's true value (real talent, future performance, etc.).

As far as including a pitcher's ability to control the running game in the Gold Glove award, one, I would have to look at the guidelines before I knew for sure what my stance was on that. Clearly it is "defense" (preventing runs) but clearly it is NOT fielding. The fact that they call it a Gold "Glove" suggests that they are interested in fielding only and NOT every aspect of defense. If that were the case, why would you not include a pitcher's pitching in his Gold Glove qualifications? IOW, can you say for sure whether holding runners is part of pitching or part of fielding or an entity all to itself? I don't think it is clear cut (in fact I know it is not). Obviously a catcher's SB/CS totals are a lot closer to being part of his "fielding" than it is for a pitcher, since he has to actually throw the ball. I don't think that the fact that it is included in a catcher's GG quals makes it anywhere near a no-brainer for pitchers. In fact, if traditionally managers do not take that into consideration for pitchers, then you can make a reasonable argument that it shouldn't be considered, as sometimes tradition creates the rules, when the rules are incomplete or ambiguous.

Yes, defense does generally decline before overall offense, J Cross. The evidence is that fielding is like triples and that it peaks very early for obvious reasons (speed and agility rather than learning or experience are the major components, at least at the major league level). As far as whether a "declining" fielding metric suggests an impending decline in offense, on the average, I have no idea. I doubt there is much connection, although there is probably a decline in triples, SB, and bunts and infield hits contemporaneous with a decline in defense, but I don't think that one "follows" or "signals" the other.

More importantly, however, is the fact that it is almost impossible to tell whether a player has started to decline or not (or when he peaks, etc.), as there is just too much noise. Anecdotal examples are the A. Jones thing and Jeter in hitting, although I'm sure there are many examples...

Some more 03 UZR "previews" (ones that I thought were interesting without necessarily mentioning any numbers): These are preliminary because I haven't put in any changes yet into the UZR methodology that I plan to, based on some great discussions and suggestions on Primer and Fanhome, and I haven't updated the park factors yet (I am using last year's which should be around the same for all parks but Cin I think)...

R. Alomar, not surprisingly, was terrible. Time to retire real soon. Although we should expect to see some rebound offensively and defensively, I would not want him on my team at any price.

Bagwell still good even though offense has declined a lot with age. My aging and defense research last year indicated that defense at first base, unlike any other position, may improve with age into a player's 30's like some offensive components. That is not too surprising, I don't think.

D. Bell is still great at third and unheralded I think.

Berkman is surprisingly decent.

As I said Biggio did not make the adjustment to CF real well, but putting someone in CF (a young person's position) at his age was not a good idea.

A. Boone still good on defense, which makes his overall value pretty good despite mediocre offensive numbers.

Amazingly, Bordick still one of the best at any IF position, still largely because of his great hands!

S. Casey, tremendous! L. Castillo not so good.

Both Chavez' (Endy and Eric) very good.

Hee Seop no good.

Jose Cruz great!

Counsell's defense keeping him in the game, despite the ugliest batting stance in the history of baseball.

J.D. Drew no good anymore or fluke?

Eckstein and A. Everett still great! Carl Everett, OTOH, yuk!

Furcal very good, very good overall player. One of the best.

Giambi's days at first should be numbered. Yanks defense is in big trouble, other than Boone at third (and Soriano)! Here are their approx. 2003 UZR's at each position:

1B: -21 (N. Johnson was no good either)
2B: +7
SS: -26
3B: +13
LF: -4
CF: -40 (-33 Bernie)
RF: -3

That is a total of -74, almost a half run per game! Think how much better their pitchers are than what their ERA's suggest!
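The total and the per-game figure can be checked from the seven position numbers listed above. A small sketch, assuming the team total is spread over a 162-game season (the position labels are abbreviated here for convenience):

```python
# Approximate 2003 Yankees UZR by position, as listed in the post.
yankees_uzr = {
    "1B": -21,
    "2B": 7,
    "SS": -26,
    "3B": 13,
    "LF": -4,
    "CF": -40,
    "RF": -3,
}

GAMES = 162  # assumption: full regular season as the denominator

total = sum(yankees_uzr.values())
per_game = total / GAMES

print(f"team total: {total} runs")
print(f"per game:   {per_game:.2f} runs")
```

The sum is -74 runs, which works out to a bit under half a run per game.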

S. Green still terrible. Not much value overall any more.

Grissom still no good. Despite great and surprising offense in 2003, I hate him as a player.

Vlad no good in 2003. Is that a fluke (before everyone invents reasons for his decline, think Andruw)?

C. Guzman still terrible! Why does this guy still have a job?

Hatteberg, NG!

Izturis average - very surprising. Fluke?

All the Lees were great - Travis, Derrek, and Carlos! Travis was the best.

As I said, Lofton can still chase 'em down. His reputation for "losing it" came from a few awkward plays in the '02 WS, I think.

Lowell was way better this year. Maybe he's not as bad as I previously thought.

Tino still good. Should add a half win to the D-Rays to bring their projected win total up to around 51 games.

Mientk.. excellent!

Olerud still pretty good.

Mags was great in right. Also one of the most underrated players in baseball.

Neifi Perez back to being very good. I don't know what is up with his fielding other than he has fluctuated wildly from year to year. Maybe just extreme random variation. Someone has to have that.

Polanco an unheralded great defensive player.

Pujols can field pretty well too!

I almost forgot. Bonds was not nearly as bad as I anticipated (a little below average).

Manny was terrible!

A-Rod and Nomar both above average again (A-Rod better). One of these guys is better overall than Jeter and the other is a jillion times better overall than Jeter.

Rolen, OK, but nowhere near as good as in previous years (injuries?).

Rey Sanchez not really pickin' 'em anymore. If he can't, that should be end of career - oops, I forgot about his veteran leadership value!

Ichiro finally has good UZR numbers! Maybe his last few years' low UZR's were a fluke.

Tejada below average, but not by much.

Thome was horrible! I can't think of his previous years' UZR's (good, bad?) off the top of my head.

Jose Valentin fantastic! Another terrific and underrated player I think.

Vina was terrible in limited play.

Ty Wigginton was atrocious. What's his reputation on defense?

Womack wasn't too bad this year. Still can't hit of course. He and Neifi are interchangeable to me and should be bat boys for some team, not batters.

As I said, Cameron was lights out again. Easily the best defender in baseball year after year.

Keep in mind that the above comments are heavily and unfairly biased towards one-year samples, which are hardly necessarily representative of a player's true defensive ability. To get a much more reliable snapshot of a player's true defensive value, you should look at their multi-year (3?) UZR's combined (perhaps weighted). If you don't look at them combined you will be tempted to make unreliable inferences about their ascent or decline...
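One way to picture the "combined (perhaps weighted)" multi-year figure is a weighted average of per-162 rates. This is a sketch only: the post does not say what weights to use, so the 3/2/1 recency weights, the weighting by games played, the function name, and the sample seasons are all assumptions made for illustration.

```python
def combined_uzr(seasons, recency_weights):
    """Weighted average of per-162 UZR rates, most recent season first.

    seasons: list of (uzr_per_162, games_played)
    recency_weights: one weight per season, same order
    """
    if len(seasons) != len(recency_weights):
        raise ValueError("need one weight per season")
    numerator = 0.0
    denominator = 0.0
    for (rate, games), weight in zip(seasons, recency_weights):
        numerator += weight * games * rate
        denominator += weight * games
    if denominator == 0:
        raise ValueError("no playing time to average over")
    return numerator / denominator


# Hypothetical fielder whose single-season numbers bounce around.
seasons = [(16.0, 150), (-2.0, 150), (10.0, 150)]
combined = combined_uzr(seasons, recency_weights=(3, 2, 1))
print(f"combined UZR per 162: {combined:+.1f}")
```

The combined figure moves much less from year to year than any single season does, which is the reason given above for not reading a decline into one number.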

Posted 12:02 p.m., December 15, 2003 (#23) - bob mong(e-mail) (homepage)
Thanks for the sneak peek, MGL! Fun to read and think about.

I thought this was interesting (from the article): "If Dan Wilson (92 starts) didn't share the position with Ben Davis, he'd get my vote. He was part of the duo that led the league in fewest steals allowed, he led the league in fielding percentage (only one error), and shared the lead in fewest passed balls allowed among catchers with at least 800 innings. But it's hard to pick a guy who caught only 57% of his team's innings, so I'll concur with the voters and give the nod to Molina."

I thought this was interesting because the guys at U.S.S. Mariner have often said that Dan Wilson is very overrated defensively. So I emailed them and asked them what they thought of Tippett's comment, and Derek emailed me back and said (paraphrase) that since Dan Wilson caught Moyer almost exclusively last year, that throws off the SB/CS numbers (and possibly the other numbers, too), since Moyer is left-handed, controls the running game very well, and also has good control and therefore doesn't throw a lot of balls in the dirt or past the catcher. What do you guys think? MGL, what does UZR say about Wilson? Does catching most of Moyer's innings skew Wilson's numbers up (and, Ben Davis' numbers down)?

Posted 1:16 p.m., December 15, 2003 (#24) - Tangotiger
Since Tippett goes through a process similar to what I do with "Evaluating Catchers", I find it a little hard to believe that Tippett would have let something like that slip by. When I get the 2003 PBP data, I'll break down the Mariner pitchers and catchers for 1999-2003.

Posted 3:30 p.m., December 15, 2003 (#25) - MGL
Bob, there is no doubt that a catcher's numbers (SB/CS and PB) should be adjusted for the pitchers they catch, especially a part-time catcher who might catch a particular pitcher every time he pitches. Whether Tippett did that or not, I have no idea.

UZR does not rate catchers. In Super-lwts there is a category for catcher lwts which uses a catcher's SB/CS, PB, and error rates to come up with a runs saved or cost as compared to an average catcher given the same # of opportunities. Unfortunately, I do not control for the pitchers either. I should, and maybe I will from now on.

Jamie Moyer does indeed have one of the lowest WP rates in baseball so yes, you would expect his catchers' PB rates to be somewhat low as well (I think - I don't know how much correlation there is - PB's are supposed to be somewhat independent of the pitcher - i.e. a PB is not supposed to be a pitch in the first or other difficult or "wild" pitch). And of course, any catcher who catches a left pitcher is necessarily going to be helped considerably with his SB/CS numbers. OTOH, it is sometimes difficult to tell "which came first" - the catcher's SB/CS rate or the pitcher's, especially since I don't know of anyone who has made a reasonable estimate of each player's (pitcher or catcher) percent of "responsibility" in the SB/CS attempt and success rates.

Looking at Moyer's 2000-2002 SB/CS numbers and using that as a proxt for how HE controls the running game, i.e., how much HE helps Wilson, he allowed 34 SB in 56 attempts, which looks like about an average number and rate (61%) for a LHP. Compare that to someone like Maddux, who in 3 years has allowed 74 SB's in 101 attempts (73% success rate). OTOH, SEA RHP's, such as Baldwin (24/46), Garcia (36/42), and Pineiro (36/50) had similar attempt and success rates.
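The success rates quoted above follow directly from the SB and attempt counts in the post. A short sketch that recomputes them, reading the figures in parentheses for the Seattle right-handers as SB/attempts in the same way as the Moyer and Maddux numbers (that reading is an assumption):

```python
# (stolen bases allowed, attempts) as given in the post.
running_game = {
    "Moyer": (34, 56),
    "Maddux": (74, 101),
    "Baldwin": (24, 46),
    "Garcia": (36, 42),
    "Pineiro": (36, 50),
}

# Success rate for the runner: steals divided by attempts.
success_rates = {
    pitcher: steals / attempts
    for pitcher, (steals, attempts) in running_game.items()
}

for pitcher, rate in success_rates.items():
    steals, attempts = running_game[pitcher]
    print(f"{pitcher:8s} {steals:3d}/{attempts:<3d} {rate:.0%}")
```

Caught stealing totals are just attempts minus steals, so the same table could be read from the defense's side by subtracting each rate from one.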

Looking at my Superlwts from 2000-2002 (I don't have the 2003 ones yet), and using the methodology described above for "catcher defense," I have Wilson at only +5 runs, Molina at +16 (compare Piazza at -27), so even with possibly getting "help" from Moyer, Wilson was not all that great for those 3 years...

Posted 3:44 p.m., December 15, 2003 (#26) - MGL
proxt=proxy
pitch in the first=pitch inthe dirt
left pitcher=lefy pitch
my typing = sucky