Tango on Baseball Archives

03 MLE's - MGL (December 28, 2003)

Lots of good stuff in there. MGL does his projections for top minor league players. I'll bet you can start a team, RIGHT NOW, and finish above .400, if you are given access to any team's players 26 and older from the minors.
--posted by TangoTiger at 04:14 PM EDT

Posted 5:04 p.m., December 28, 2003 (#1) - Scoriano
http://yankees.theinsiders.com/2/216981.html?refid=1

Myrow is of course behind the former QB from Michigan and is reportedly a terrible fielder in the accounts I have read.

Posted 5:37 p.m., December 28, 2003 (#2) - MGL
It is fascinating to read the scouting reports on players in the minors and to look at their MLE's as well. A few thoughts:

How bad do you have to be on defense not to play in the majors if you are a very good hitter? Not only do we know that errors are a small part of defense overall, shouldn't we at least look at the average error rates in the MINORS at the various positions? Why is this scouting report comparing his error rate to that of major league players? Are error rates very dependent on a minor league player's home stadium (are some of them "cornfields")? What about Myrow's range? What does the age or experience curve for error rate look like?

Why is he "running out of time" at age 27, when his MLE's say that he is more than ready to hit at the major league level? What are they waiting for? Obviously the Yankees have very little room for offensive players at the major league level. It would seem that Myrow being still in the minors is not his fault at all. Is there any evidence that a player with a good MLE over many PA's (1000+) needs "seasoning" in the minors for some reason? Is there any evidence that a 23 yo with the same MLE projection (the projection including the appropriate regression accounting for age) as a 27 yo will do any better or worse in the majors, other than the 23 yo is likely to get better and the 27 yo is likely to get worse because of age?

Drew Henson? His MLE in 2003 was -21 in 500+ PA's and -13 in 2002 in 500+ PA's. In 2001, it was -33 in over 300 PA's. Seems to me the guy can't hit, period. I wouldn't waste ANY time waiting for him! In fact, after 2001, and even more so after 2002, I would not have wasted my time either.

If you use a player's MLE, isn't it a great opportunity to unload (trade) a player with a lousy hitting projection (especially in a hitter's park, where his raw minor stats might actually look good) but a great reputation, like Henson, and vice versa (pick up an unknown player/prospect with a great MLE)? Shouldn't the Yankees be able to trade Seguignol to some team that desperately needs a good hitting first baseman, as they don't need him as long as they have Bernie and Giambi, and even when Bernie is gone, Seguignol is likely to be too old?

What teams do you think use MLE's to evaluate minor leaguers? I don't see how you can evaluate them any other way! At the very least, shouldn't all teams have a list of every minor league player's MLE in front of them at all times?

And finally, I think Tango is going crazy! Why is UZR in the title to this thread? I think he means "LWTS"! He is starting to type like me (20 WPM and 15 typos per minute)!

Posted 6:28 p.m., December 28, 2003 (#3) - Virgil
MGL,

I know you're open with most of your work - I was curious to know how you calculated your MLEs?

Thanks

Posted 7:10 p.m., December 28, 2003 (#4) - MGL
Mr. Tibbs,

No problem. Quite simple actually.

First, as I said, I park and league adjust the raw stats. I use home/road splits to park adjust the stats. I use 5 year component PF's for s, d, t, hr, bb, and so and regress the 5-year sample PF's to create "true" PF's. The league adjustments are done simply by using the ratios of the player's league to the total league for each of the categories (s, d, t, hr, etc.). When I say "league," I mean INT and PCL for AAA and SOU, EAS, and TEX for AA. No regression there. AA and AAA have separate MLE coefficients, so no league adjustment is needed for AA or AAA as a whole.

Anyway, once the raw stats are adjusted for park and league, I simply multiply them by the following MLE coefficients. These are the "best fit" coefficients (on average - not using a regression analysis) I use:

AA

s=.95
d=.85
t=.95
hr=.61
bb=.87
so=1.15
sb=.9
cs=.9

AAA

s=.98
d=.87
t=.93
hr=.68
bb=.9
so=1.10
sb=.85
cs=1.05

Then I simply use a standard Palmer type lwts formula (including SB and CS), where all the components are set at the last 3 years' major league levels. IOW, let's say a minor league (AAA) player has a normalized (where 1.00 is average in AA or AAA) HR rate of 1.50. After multiplying that 1.50 by the MLE coefficient of .68, we get 1.02, so now our player is expected to hit 1.02 times the average major league HR rate, since he hit 1.5 times the average AAA HR rate, and players who play in AAA and the majors in the same year hit 68% of their AAA HR rate when in the majors (that's where the MLE coefficients come from) over the last 3 years. If the average HR rate in the majors were 14 per 500 PA over the last 3 years, then I use a HR rate of 14 times 1.02, or 14.28, per 500 PA for this player to compute his MLE lwts.
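
The HR arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function and variable names are made up here, and the 14 HR per 500 PA figure is the hypothetical league rate from the example, not a measured one.

```python
# AAA MLE coefficients as listed in the post above.
AAA_COEF = {"s": 0.98, "d": 0.87, "t": 0.93, "hr": 0.68,
            "bb": 0.90, "so": 1.10, "sb": 0.85, "cs": 1.05}

def mle_rate_per_500(normalized_rate, coef, mlb_rate_per_500):
    """Turn a park- and league-adjusted minor league rate (1.00 = league
    average) into an expected major league rate per 500 PA."""
    return normalized_rate * coef * mlb_rate_per_500

# A AAA hitter at 1.50 times the AAA HR rate, with a hypothetical
# major league average of 14 HR per 500 PA.
hr_mle = mle_rate_per_500(1.50, AAA_COEF["hr"], 14)
print(round(hr_mle, 2))  # 14.28
```

The same function would apply to each of the other components before they are fed into the lwts formula.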

That's it!

Posted 7:26 p.m., December 28, 2003 (#5) - MGL
I also finished doing the pitcher MLE's. From those I can compute MLE ERC (component ERA). I include a pitcher's minor league WP rate in his component ERA, BTW. I also use a player's actual home/road splits to do the park adjustments, and I also use a pitcher's actual Sngls, Dbls and Trpls rates, and not just non-HR hits.

Here are the best MLE ERC's for 2003, min 400 TBF's:

Name, age, team, hand, St/Rel, TBF, ERC

J. Brown, 27, BUF, R, S, 448, 2.42
Cotts, 24, BIR, L, S, 440, 2.70
R. Beltran, 35, OTT, L, S, 412, 3.14
C. Reyes, 35, DUR, R, S, 522, 3.24
Tsao, 23, R, S, 446, 3.35
Wasdin, 32, NVL, R, S, 553, 3.39
Griffiths, 26, NOR, R, S, 459, 3.45

The ERC's are based on an average major league ERC of 4.00 by definition.

Only Brown and Griffiths pitched in AA or AAA in 2002 (I think). Brown's MLE ERC in 2002 was 4.35 on 419 TBF, and Griffiths' was 6.04 in 646 TBF. The average MLE ERC in AA and AAA combined is 5.79, based on an average major league ERC of 4.00.

That is Jamie Brown, BTW. Anyone know who he is? Seems like he should be pitching in the majors, based on these numbers.

Posted 8:10 p.m., December 28, 2003 (#6) - Scoriano
Is there any study on the predictive power of MLE's for older prospects? IIRC, are there relatively many successful late bloomers, that is to say, hitters that first achieved great success in the majors at age 27 or 28 or later? I seem to recall Neyer discounting the probability of Shane Spencer becoming a successful Major League power hitter on a similar basis.

As for Seguignol, I observed his limited sample size for the Yankees last year and was not impressed but that can be true of anyone in so few at bats. Nonetheless, I suspect that he is what the scouts might call a career AAA hitter. I don't know if that is fair, but when a player is that old, you might want more than MLE's and get the scouting reports, too. I think you'd always want them if you could find good scouts but one's MMV. In any event, I think he is likely to get a fair shot in spring training with the departure of Nick Johnson.

Posted 8:10 p.m., December 28, 2003 (#7) - Patriot
Jamie Brown is a soft tosser from the Indians organization. He's a guy who never gets mentioned when they talk about our potential pitching prospects, and I've always wondered why. I didn't realize he had such good minor league numbers when adjusted for context, but I did know they were better than Paul Rigdon and Jason Phillips and other guys who the Indians have given chances.

Posted 8:36 p.m., December 28, 2003 (#8) - Rally Monkey
Jamie Brown pitched part of the year for Buffalo and finished with Pawtucket, making 13 starts in 31 total games. His record was 8-5, 2.95 ERA, 113 IP, 85 H, 22 BB, 65 K. He'll be 27 next year. In 2002 he pitched in AA Akron, 9-5, 2.78, 104-98-17-72.

John Sickels didn't mention him in his prospect book last year. His strikeout rates are a little low for my tastes.

Posted 9:50 p.m., December 28, 2003 (#9) - MGL
Here are a couple more that were not on my previous list:

Edgar Gonzales, 21, ELP, R, S, 697, 3.45
Telemaco, 30, SWB, R, S, 599, 3.48

As far as Brown's K's, we hear it all the time, but I don't recall ever seeing an actual study that suggests that a pitcher's K rate is a good independent predictor of future success. If anyone knows of one, point me in the right direction. Sure, if a pitcher has a low K rate AND his $H rate is low, then his ERA or ERC may overestimate his projection. But if his $H rate is average or worse or you use a DIPS ERA or something like it (where you regress each component differently before combining them to form a regressed ERC), then does it matter what that K rate is? I don't think so, but I could be wrong.

IOW if pitcher A has a 4.00 ERC or ERA in 1000 TBF with an average $H rate and a K rate of 5 per 9 innings, and pitcher B has the same ERC or ERA in 1000 TBF, also with an average $H rate, but his K rate is 8 per 9 innings, is there any evidence to suggest that pitcher B's overall projection is going to be better than pitcher A's? Again, I don't think so, or at least I've never seen any good evidence to indicate as such. It is not difficult to test that assertion. We hear all the time about how important a pitcher's K rate is, but that is because if a pitcher's K rate is low, in order to be as good or better than another pitcher whose K rate is higher he will have to excel in either his BB rate or his HR rate, which is difficult, but certainly not impossible.

Obviously if we have 2 pitchers with the same ERA or ERC in the same period of time, we'll take the one with the higher K rate, as the other one is more likely to have gotten lucky with his ERA (low $H). But again, if we have two pitchers with equal ERA's or ERC's and one has a higher K rate, but they both have the same $H, I don't think it matters which one you take.

So before we say that we don't "like" Brown's K rate, don't we have to look at his $H? If it is not low, suggesting good luck, then his K rate is no problem, and his ERC over the last 2 years is a decent indication of how he will pitch in the future, even after regressing towards the average ERA of a rookie pitcher. In fact, if we take Brown's weighted ERC average from 2002 and 2003 over those 850 TBF's or so, we get 3.25 with a weighting of 4/3. If we regress that even 60% towards the mean of a rookie pitcher, say, 4.5 (again, where 4.00 is defined as major league average), we get a projected normalized (to 4.00) ERA in the majors of 4.00, which is a hell of a lot better than a replacement pitcher. I'll take him any day of the week!
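
A minimal sketch of that projection, for anyone who wants to check the numbers. It assumes the 4/3 weighting applies to the two season ERC's directly (2003 weighted 4, 2002 weighted 3) with no further weighting by TBF, which is an assumption on my part; the helper names are invented for the example.

```python
def weighted_average(values, weights):
    """Plain weighted mean of `values`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def regress(sample, mean, fraction):
    """Move `sample` the given fraction of the way toward `mean`."""
    return sample + fraction * (mean - sample)

# Brown's MLE ERC from the posts above: 2.42 in 2003, 4.35 in 2002.
erc = weighted_average([2.42, 4.35], [4, 3])

# Regress the rounded figure 60% toward an assumed rookie mean of 4.5.
projection = regress(round(erc, 2), 4.5, 0.60)
print(round(erc, 2), round(projection, 2))  # 3.25 4.0
```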

Posted 10:39 p.m., December 28, 2003 (#10) - MGL
I'll bet you can start a team, RIGHT NOW, and finish above .400, if you are given access to any team's players 26 and older from the minors.

BTW, that's a fascinating thought experiment! What would your average MLB team record be if you fired all of your scouts, fired your entire roster and just used the top 25 guys in MLE lwts (accounting for defensive position of course) and MLE ERC??

What about for a team like Det and TB? What if you did the same but combined your major and minor league rosters?

In fact, here are the top 10 position players in the DET organization (Toledo and Erie) and the top 5 starting pitchers. How many games would this team project to win at the major league level?

Offense

Name, lwts per 500 PA (based on 2003 stats) regressed 50% towards -14 (min 300 PA in 2003), age

C Inge, -11, 27
1B Daigle, -16, 25
2B Tousa, -15, 25
3B Ust, -17, 26
SS Bautista, -15, 26
OF Nicholson, -14, 28
OF Varner, -14, 24
OF Walker, -16, 27

Those are their real defensive positions.

Pitching

Name, 2003 ERC regressed 75% toward 5.00, where 4.00 is major league average, age

Ahearne, 4.67, 35
Loux, 4.75, 25
N. Robertson, 4.90, 27
Henkel, 4.92, 26
M. Johnson, 5.03, 29

OK, so we have a total of -118 lwts runs in offense in 500 PA. 162 games is around 1.3 times that amount or -156 lwts runs for the season. If the average rpg is 5.0 or 810 for the season, we have these guys at 654 runs for the season.

The pitching is really not that bad. If we just average the ERC's of the above 5 pitchers, we get 4.85 or .85 above average. Since each game is around 8.75 innings, you have 157 9 inning games in a 162 game season, so our pitching staff will allow 157 times .85 or 134 runs more than average, or 944 runs.

So we will score 654 and allow 944. Using a pythag exponent of 2, that is a pythag w/l record of .324, or 52.5 wins and 109.5 losses or almost 10 wins better than their major league record this year, and almost 6 wins better than their 2003 pythag record. How about that! How much would the above team cost?
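
For reference, the pythag step can be reproduced with a short Python sketch. It takes the season totals exactly as posted (-156 lwts runs on offense, 134 extra runs allowed, 810 league-average runs) rather than re-deriving them, and the function name is made up for the example.

```python
def pythag_pct(runs_scored, runs_allowed, exponent=2):
    """Pythagorean winning percentage: RS^x / (RS^x + RA^x)."""
    rs, ra = runs_scored ** exponent, runs_allowed ** exponent
    return rs / (rs + ra)

# The eight hitters listed above sum to -118 lwts runs per 500 PA.
lwts = [-11, -16, -15, -17, -15, -14, -14, -16]
print(sum(lwts))  # -118

LEAGUE_RUNS = 5.0 * 162            # 810 runs at 5.0 rpg
runs_scored = LEAGUE_RUNS - 156    # posted season offense total
runs_allowed = LEAGUE_RUNS + 134   # posted season pitching total

pct = pythag_pct(runs_scored, runs_allowed)
wins = pct * 162
print(round(pct, 3), round(wins, 1))  # 0.324 52.5
```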

Posted 10:42 p.m., December 28, 2003 (#11) - MGL
The 4.85 normalized MLE ERC for our pitching staff is after regressing the 50-man average 75% towards 5.00.

Posted 11:33 p.m., December 28, 2003 (#12) - MGL
5-man, not 50-man...

Posted 12:52 a.m., December 29, 2003 (#13) - Charles Saeger(e-mail)
MGL -- do you keep batting outs or at bats as the constant, and do you adjust for times on first base before adjusting SB and CS?

Posted 3:11 a.m., December 29, 2003 (#14) - MGL
It's crude, I admit. I just keep the PA's constant, and adjust the rates of the components using the above MLE coefficients. For example, if a player has 20 HR's per 500 PA in AAA, he gets 20 times .68 (the MLE HR coefficient for AAA) or 13.6 HR's per 500 PA in the majors (that is his HR MLE). So the outs get adjusted automatically as they are whatever is left over after adjusting the MLE s,d,t,hr,and bb+hp.

If I want to compute an OBP or SA or BA from the s,d,t,hr,and bb+hp rates per 500 PA, I figure around 5 total SF's and CI's, or something like that (and ignore SH's and IBB's, as I do for major league stats). For SB and CS I just use the minor league raw numbers, again, per PA, and not per times on first, and then multiply those by the MLE coefficients. Not some of my most rigorous work, but it does the trick.
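
Roughly, the per-500-PA method described above looks like the sketch below. The AAA stat line is entirely hypothetical (only the 20 HR figure comes from the post), and applying the bb coefficient to bb+hp combined is an assumption based on the wording above.

```python
# AAA coefficients for the on-base components, from the earlier post.
AAA_COEF = {"s": 0.98, "d": 0.87, "t": 0.93, "hr": 0.68, "bb": 0.90}

def mle_line(per_500, coef, pa=500):
    """Scale each component by its MLE coefficient, holding PA constant.
    Outs are whatever is left over after the adjusted components."""
    mle = {k: per_500[k] * coef[k] for k in coef}
    mle["outs"] = pa - sum(mle.values())
    return mle

# Hypothetical AAA hitter, counts per 500 PA ("bb" here means bb+hp).
aaa = {"s": 90, "d": 25, "t": 3, "hr": 20, "bb": 50}
line = mle_line(aaa, AAA_COEF)
print(round(line["hr"], 1))  # 13.6
```

Because every coefficient for the on-base events here is below 1.00, the leftover outs always go up in the translation.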

Posted 8:29 a.m., December 29, 2003 (#15) - David Smyth
==="..I don't recall ever seeing an actual study that suggests that a pitcher's K rate is a good independent predictor of future success. If anyone knows of one, point me in the right direction."

Well, it depends on what you mean by 'future success'. If you mean just next season, that's one thing. But if you are looking more at longer-term development, then Ks are supposedly a huge consideration. In the 1982 Abstract B James had quite a few studies in an article called "Looking for the Prime". Among the results for pitchers:

1) In a matched pairs study of young power pitchers vs young finesse pitchers, holding age and wins constant, the power pitchers won 30% more games in the rest of their careers.
2) The power pitchers who won the Cy Young performed markedly better in the following season than did the finesse pitchers who got a Cy.
3) The power pitchers who won the Sporting News Rookie of the year won almost twice as many games over the rest of their careers as the finesse pitchers who won that award.
4) If the lg leader in ERA was a power pitcher, he had a far greater tendency to repeat on the list in any following season than if finesse.
5) Of the pitchers with the finest careers, almost all were power pitchers in their early years.
6) His main study in this area--coding pitchers as power or finesse, good or bad control, winning or losing record, high or low ERA, and right or left-handed, he compared the expected remaining career wins vs the actual. The power pitchers noticeably exceed their estimate, and vice versa for the finesse. This finding applied to every single combination of factors, such as any selected age, ERA, W/L, control profile groups.

This doesn't have all that much to do with exactly what MGL is talking about. I thought it would be good to post this because many people have not seen that 1982 Abstract, and this article was one of James' all-time best, IMO. But if you have 2 pitchers of the same age, with the same ERA, ERC, and DIPS ERAs--the one with the higher K rate has a greater career expectation. Maybe the effect for just the following season only is so slight as to be ignorable, but I doubt it.

Posted 9:33 a.m., December 29, 2003 (#16) - tangotiger
Mickey: you have the MLB ERA average (or is that RA) as 4.00, and you regress your pitchers towards 5.00. Then, you state that your hitters have a league average 5.00 RPG. I think it gets confusing, and you should settle on a common baseline for both hitters and pitchers, and for pitchers, use RA and not ERA.

Posted 12:54 p.m., December 29, 2003 (#17) - MGL
DS, yes I am vaguely familiar with some research on long-term results of high K and low K pitchers. But...

But if you have 2 pitchers of the same age, with the same ERA, ERC, and DIPS ERAs--the one with the higher K rate has a greater career expectation.

I do not think that BJ or anyone else controlled for DIPS ERA! That is the problem! Two pitchers equal in regular ERA, one with high K (power) and the other with low K (finesse), the power pitchers will tend to be better at all times in the future because their DIPS ERA will be different! The low K pitchers will tend to have a lower $H rate (luckier). IOW, their projections will NOT be the same! You have to take the two groups and control for a modified DIPS ERA and THEN look at future (short and long-term) results! I don't think this was done, as no one knew or at least talked about the fact that different components should get regressed differently in calculating a pitcher's projected ERA from his sample ERA!

Even given that, I have no doubt that a pitcher who is a "power" pitcher may have a "longer" career because they tend to be bigger and stronger. There is some selective sampling problem though if you study length of career as a function of K rate. As a pitcher gets older and starts to suck, teams will tend to keep the power pitchers (e.g., B. Witt, Helling), and let them continue pitching, and not the finesse pitchers, even for the same level of suckiness.

Tango, I did it right. I just used confusing terminology. The pitcher numbers are all "normalized" to 4.00 as average. The 4.00 doesn't mean anything. I then regressed their sample ERC's to 1 run higher than a league average pitcher, somewhat arbitrarily. It maybe should be a little higher as the average minor league pitcher has an MLE ERC of almost 2 runs higher than an average major league pitcher.

For the hitters I just took their sample MLE lwts below league average and regressed them to -14 per 500 PA, which is also around the average MLE of all minor league hitters. The 5 rpg is just a number I used to do the pythag w/l record. It could have been 4.5 or it could have been 5.5, although 5.0 is about an average AL rpg over the last 3 years. What did you think of my estimations for an all Detroit minor league team? Do you think it is reasonable? I honestly do. I agree that the average "best of" minor league team would be at least .400 in the majors, which is amazing if you think of it. Also, what do you think the average "MLE" UZR of a minor leaguer is? Given their young ages, I would have to say it is around zero...

Posted 1:28 p.m., December 29, 2003 (#18) - Rally Monkey
I've started an APBA simulation using projected 2004 stats for the AL. The Tigers look to be really bad again, with the biggest additions being Fernando Vina and Rondell White. The first try through a quick season the Tigers had about 60 wins. It seems pretty hard to put together a worse team while using professional talent and playing the best guys available to you. To do what the Tigers did last year takes bottom level talent AND some major bad luck.

I wonder though, how would the 2003 Tigers have done if they played at AAA? I know in Kevin Witt and Carlos Pena they have 1B/DH's who have recently put up .550-.600 SA's in AAA. Dmitri Young might put up Pujolian numbers if he faced AAA pitchers all the time.

Posted 1:33 p.m., December 29, 2003 (#19) - Rally Monkey
MGL, I see Inge on your Tiger minor league team at -11/500. He's got 840 big league at bats over the last 3 years. What's his projection based on big league stats?

Posted 3:25 p.m., December 29, 2003 (#20) - MGL
Rally,

I couldn't find a halfway decent catcher in the minors so I used Inge. I just used his MLE from last year, but of course he has been a lot worse in the majors. In fact, I don't know that anyone in recent memory has hit worse in the majors in those 840 big league AB's than Inge.

In 2001-2003 in around 250-300 PA's in the minors, he actually didn't hit that badly - around -3 lwts per 500 PA in MLE lwts. OTOH, in his last full year in the minors, 2000, his MLE lwts was -20.5 per 500, which makes you wonder why he was brought up in the first place, and also should indicate that it was no surprise that he would hit so poorly in the majors. Unless he had some outstanding defensive skills, and I don't recall ever hearing anything about that, a minor league player with a -21 MLE lwts, even a catcher, is a dime a dozen. Surely you can find a better hitting catcher at the major league minimum price. Just an example of how woeful an organization Det is. And yes, it is a pretty fair assumption that no matter how much a major league team has to spend, within reason, if it loses 119 games in a season, and if it is perennially horrible, that it is managing its affairs in an incompetent fashion.

I don't have his 2004 projection yet, but I would say that it would be in the -30 per 500 PA range, which is like a .620 OPS or something like that. What does Pecota, Shandler, Marcel, or ZIPS have him as?

Posted 3:32 p.m., December 29, 2003 (#21) - David Smyth
----"I do not think that BJ or anyone else controlled for DIPS ERA! That is the problem!"

Well, of course not, since the James study is 20 years old. Again, I didn't post that to "prove" anything, but just as a source of information. All B James showed is that, given equal overall performance (ERA, W/L %, whether good or bad), the higher K pitcher has a significantly greater long-term expectation. A pitcher who achieves his 3.00 ERA, 18 win season by means of hi Ks will tend to win more games in the future than will pitchers who do so by other means (lo BB, lo HR, lo BABIP).

Posted 4:19 p.m., December 29, 2003 (#22) - Rally Monkey
Zips has Inge at a .645 OPS, which would be a career high by a mile. There really is no excuse for him. Detroit had Michael Rivera in 2002 and gave him about a month and a half before getting rid of him for the crime of hitting like Inge. The difference, though, is that Rivera has hit well in the minors, and might be a productive player if someone gives him 840 AB to adjust to the majors.

They could have kept Robert Fick behind the plate. His defense was bad but not bad enough to justify giving up 30-40 runs offensively at the position.

Posted 4:21 p.m., December 29, 2003 (#23) - tangotiger
MGL: Marcel and I don't have access to minor league data in a downloadable form. Therefore, what should Marcel do? Give every rookie a blanket OBA and SLG of 95% of league average!! How can Marcel do that? Well, since we only compare various forecasting systems with an after-the-fact 200 or 300 PAs, this is a great way to cheat. And, if a rookie were given that much playing time, he probably performed around the league average.

If Marcel were a smarter monkey, he'd make it 90% of league average for the IF/C, and 100% of league average for the OF/1B.

The one place where PECOTA, DMB, ZiPS, and MGL trumps Marcel is with rookies and sophs.

Posted 4:22 p.m., December 29, 2003 (#24) - Rally Monkey
From the AA/AAA adjustments-

It looks like, on average, a AAA player will lose 20% of his value relative to the league, while a AA player will lose 25%. Did you look at AA players going straight to the majors or start with AA players to AAA and then add the AAA adjustment?

Posted 4:35 p.m., December 29, 2003 (#25) - MGL
Rally, sure any projection for Inge is going to be a lot higher (and rightfully so) than his career major league average for 2 reasons: one, regression to the mean of an average young major league catcher in a sucky organization, and two, his minor league MLE's are much better than his major league numbers and should be factored into the projection.

"The difference, though, is that Rivera has hit well in the minors, and might be a productive player if someone gives him 840 AB to adjust to the majors."

I know of no evidence that suggests that hitters have to get "used to" the major leagues. If that is true then we would see overall a big upswing (more than the regular age curve would indicate) in performance from a player's rookie year to his sophomore year or a big upswing from a player's first 300 PA's to his next 300 PA's or some pattern like that. Do we? I doubt it. When you see a player who comes up to the majors and sucks for a couple of hundred PA's, it is most likely a statistical fluke, just like any sucky couple of hundred PA's for any major leaguer at any time in his career. I could be wrong but the burden of proof with ALL of these conventional wisdoms and assumptions is on the conventional wisdomER. Why? Because I say so! Actually, for no other reason than because conventional wisdom is usually, by an overwhelming degree, wrong. So which is more efficient - to put the burden of proof on the conventional wisdom claim or the sabermetric claim?

DS, yes, you are just repeating what I said about the high K low K thing. The point I am making is "Does a high K pitcher have a better short or long term future, once we control for $H, than a low K pitcher?" That's all. And it is a critical question, because in this century, we now know that when doing projections for pitchers, we must control for $H (regress them a lot more than the other components). So when we project 2 pitchers to have an equal context neutral ERA, and one has higher K rate, we really want to know if one will indeed perform better than the other next year or 5 years from now. James' studies did not answer that. In fact, given what we know now about DIPS, James' results were a foregone conclusion, since if you control for ERA, a pitcher with a high K rate will actually have a better DIPS ERA than a pitcher with a low K rate, so the high K pitcher SHOULD perform better in the future! We don't use non-DIPS ERA's anymore for projections, if we don't have to, so we won't get the same projections any more for those pitchers with the same regular ERA's but different K rates!

A simple study needs to be done to look at this question again. Look at all pitchers of a similar DIPS ERA. Break them down into 2 groups - high K rate and low K rate. Adjust for age, and look at each group's regular or DIPS ERA the next year (going forward, it doesn't matter whether you use DIPS ERA or ERA - they will be the same). That will tell you if a pitcher's K rate is predictive of future success independent of his DIPS ERA (or another good projection). Then look at the rest of the careers of both groups to see if K rate is indicative, again independent of a regular projection, of a longer or more successful career...
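
The one-year part of that study design could be sketched along these lines. Everything here is hypothetical: the record layout, the function name, and the sample numbers are made up purely to show the mechanics, and the age adjustment is left out for brevity.

```python
from statistics import mean, median

def k_rate_study(pitchers, dips_lo, dips_hi):
    """Within one band of DIPS ERA, split pitchers at the median K rate
    and return the mean next-year ERA of the (high-K, low-K) groups."""
    band = [p for p in pitchers if dips_lo <= p["dips_era"] < dips_hi]
    cut = median(p["k_per_9"] for p in band)
    high = [p["next_era"] for p in band if p["k_per_9"] > cut]
    low = [p["next_era"] for p in band if p["k_per_9"] <= cut]
    return mean(high), mean(low)

# Made-up records, NOT real pitchers or real results.
sample = [
    {"dips_era": 4.0, "k_per_9": 5.0, "next_era": 4.2},
    {"dips_era": 4.1, "k_per_9": 8.0, "next_era": 4.0},
    {"dips_era": 3.9, "k_per_9": 5.5, "next_era": 4.1},
    {"dips_era": 4.0, "k_per_9": 7.5, "next_era": 4.3},
    {"dips_era": 5.5, "k_per_9": 9.0, "next_era": 5.0},  # outside the band
]
high_era, low_era = k_rate_study(sample, 3.75, 4.25)
print(high_era, low_era)
```

With real data you would run this over several DIPS ERA bands and age groups, and a gap between the two means that holds up across bands would suggest K rate carries information beyond the DIPS ERA itself.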

Posted 5:40 p.m., December 29, 2003 (#26) - FJM
The top 10 position players could reasonably be used as a proxy for a team's entire offense. (You only list 8 Tigers minor leaguers, though, not 10. You need to add a DH at least.) But the Top 5 starters cannot be used as a proxy for a team's entire pitching staff. If the average starter gets 30 starts per season and averages 5 innings per start, then the 5 starters will consume about 55% of the total IP. The other 45% will be filled by mostly inferior pitchers, with the closer (if you've got one) accounting for 8 or 9%.

Posted 6:57 p.m., December 29, 2003 (#27) - David Smyth
MGL, now I understand your point. I agree that, if 2 pitchers are the same age with the same DIPS ERA, they should have *around* the same projection for next year, whether hi or lo K. That is a short-term prediction, which does not include the longer term *attributes* of the hi K profile, which is what James was concerned with. These longer term factors are still present on a season to season basis, but are dwarfed by other things, and may be therefore difficult to detect in a season x and season x+1 type study.

The main advantage of Ks over the short term for pitchers is the ability to keep balls out of play. Over the longer term, the advantage of the Ks is the potential for a pitcher to "develop" secondary level skills (BB, HR) while maintaining the K level (or at least to compensate for an age related reduction in Ks). That is a much easier task than that which faces lo K pitchers who are already good in BB and HR--they have already reached the practical limit in those areas, but do not have the raw ability to improve in the area in which they are lacking. Thus the longer term advantage of the hi K pitcher (on average, of course).

When you (a GM) obtain a good FA pitcher, you are usually looking at a 3 yr+ contract. That, depending on the pitcher's age, is long enough that independent consideration should be given to the K rate.

Posted 7:16 p.m., December 29, 2003 (#28) - MGL
Over the longer term, the advantage of the Ks is the potential for a pitcher to "develop" secondary level skills (BB, HR) while maintaining the K level (or at least to compensate for an age related reduction in Ks). That is a much easier task than that which faces lo K pitchers who are already good in BB and HR--they have already reached the practical limit in those areas, but do not have the raw ability to improve in the area in which they are lacking. Thus the longer term advantage of the hi K pitcher (on average, of course).

You are assuming that this is true. That is what I am questioning. It could be, but then again, it might not be. James' study (again, from memory) was flawed since it did not control for $H, so that the results might not lead to the above conclusion. IOW, all other things being equal, if you are a GM, looking at 3 years or more into the future, maybe it doesn't matter what a pitcher's K rate is.

Or it could be complicated (and non-linear). It could be, that all other things being equal, besides K rate, if 2 pitchers are bad, then the one who has the higher K rate is more likely to get better (as you say or imply above). If those pitchers are both already good (Moyer and Clemens) maybe it doesn't matter. Maybe at a certain age it matters and a certain age it doesn't matter.

What I was objecting to originally, was someone who said something like "Yeah, his sample ERC and therefore his projected ERA is pretty good, but I don't particularly like him because of his low K rate." For that statement to have any merit, a low K rate would have to imply either a short or long-term (or both) projection that would be worse than his normal projection, at least as compared to a pitcher with the same projection, but a higher K rate.

I was questioning this wisdom, because I believe that it has developed into a conventional wisdom (we just take it for granted that it is true), but that it was originally predicated on some faulty research OR I did not recall ANY good research that suggests that it might be true. That's my job. Questioning everything that people think or believe is true.

Kind of like "Is the unemployment rate, at least as we traditionally measure it (which I heard changes from time to time), significantly related to the state of the economy (whatever "the state of the economy" MEANS, which is another can of worms), AND does the President, his administration, or Congress have any significant influence over it (i.e., should the Pres get any "credit" when the unemployment rate goes down), etc., etc."

Posted 11:08 p.m., December 29, 2003 (#29) - Rally Monkey
"I know of no evidence that suggests that hitters have to get "used to" the major leagues."

I didn't mean to suggest that they did, although I don't dismiss it so easily. I have no idea if players need that, but in any case Rivera deserves a chance. He hit .310/.373/.502 in AAA last year while San Diego gave the majority of their playing time to Gary Bennett, who happens to be what Brandon Inge will be when he's 32.

Rivera may have defensive problems that I don't know about. He only threw out 18% of runners last year (3-16, small sample size). Big deal. Bennett only threw out 19%. Game calling? Until I've seen otherwise it's too small to quantify. Passed balls? Rivera had quite a few in Detroit, but he was sometimes catching a knuckleballer, Sparks. Dude deserves a fair shot.

Posted 9:33 a.m., December 30, 2003 (#30) - tangotiger
I know of no evidence that suggests that hitters have to get "used to" the major leagues.

I know of no evidence that suggests that hitters are equally disadvantaged as they play against better competition.

It's rather obvious that some players will have different translation numbers. The 3 questions to ask are:
1 - can we spot these players using their profile/quality numbers
2 - how much impact does this have
3 - to what extent does scouting help us find these players

MGL's implicit answers are:
1 - no
2 - none
3 - none

And my reply to that is: it's a lot more fun to at least try to find the answer, than to write a long-winded paragraph response with no statistically significant argument.

(Man, it's a lot easier to argue with MGL this way.)

Posted 1:28 p.m., December 30, 2003 (#31) - MGL
MGL's implicit answers are:
1 - no
2 - none
3 - none

Come on, you know that my implicit or explicit answers to 2 and 3 are not "none." It is "very little." The problem with scouting is not that it is inherently unimportant. The problem is that it is tainted with the same misconceptions and misinformation that exist at all levels of professional baseball, such that it is a lot less effective than it could be. Scouts should be trained to complement statistical analysis. They (scouts) think that they supplant it of course. That taints the whole process. A simplistic example:

Player A has great minor league hitting numbers in 1000 PA's. His MLE OPS is like .800 or .850. He was never considered a great prospect though for whatever reasons (already scouts are biased against this guy, although perhaps for good reason, but perhaps not). He gets called up and in 68 PA's hits around a .400 OPS. What is the scout who is watching him in those 68 PA's going to say: "He was completely overmatched. He is either not ready for major league pitching or he never will be." If he is a grade A prospect, and particularly if this scout or a friend of his originally touted this guy, he will lean towards, "He's not ready yet." If he is a grade B or C prospect, he might lean towards, "He's not really major league material."

Of course, all of these types of scouting reports are crap! Without the scout understanding the importance of the MLE's, and without him understanding the sample size issues of the 68 major league PA's, and because of his bias going in, based on his pre-conceived notion of whether this player is a true prospect or not, his "scouting report" is going to be so tainted as to render it almost worthless. That is why I think scouting is almost worthless. Not to mention the fact that it is done by "old baseball guys" whose average IQ is probably around 93 and who make anywhere from zero to $50,000 a year.

So what would be the proper way to have "scouted" our guy with 68 major league PA's? Well, anyone who hits .400 (OPS) in 68 PA's is going to "look" terrible in those 68 PA's, even if they were Barry Bonds. I'm sure he's had spates of 68 PA's where he has looked pretty bad (OK, maybe not Bonds, but how about Sosa). Here we have a minor league player who has torn the cover off the ball for 1000 PA's and then looks bad in 68 PA's in the majors! What can a scout tell us? I'm not sure he can tell us anything, to tell you the truth! We already know that a guy who hits .400 is going to "look" bad. He probably swung at lots of bad pitches, etc. He also probably (definitely) does that occasionally in the span of 68 PA's in the minors as well! Part of a batter's fluctuation in batting performance is "looking bad" and swinging at bad pitches, for whatever reasons. All a scout should really say in this situation is, "Well, he sure is a good hitter in the minors. Those 68 PA's were probably just a fluke. After all, it's only 68 PA's compared to the over 1000 in which he's had good success in the minors. Plus, don't forget, he faced Schilling and Maddux in 17 of those 68 PA's. Plus, he seemed particularly nervous in the show - even more so than the average rookie call-up. Like I said, I can't really tell much from 68 PA's. The guy's been playing baseball for 20 years with 100's of thousands of PA's at all levels. We've seen him in the minors for a couple of thousand PA's and he has done pretty well there so far. And let's face it, AAA baseball is no sandlot league. I'd like to see you (talking to management) hit a AAA pitcher throwing 95 MPH! Don't forget, a lot of these AAA pitchers will be in the majors soon. Where do you think all of these major league pitchers come from? Plus, there are lots of major league pitchers who are no better than the average AAA pitcher. Cripes, this guy hit .375 in 1000 PA's in AAA, and you want to know why he hit only .147 in 68 PA's in the show?
I am a scout, not an oracle, you idiots (again, talking to baseball management)! How the hell do I know WHY he didn't hit in 68 PA's? Yeah, he looked overmatched, but what do you think he is going to look like when hitting .147? You guys ever heard of bad luck? 68 PA's for Christ's sake! You want my opinion? Let him bat for 680 PA's and then we'll talk. Until then, shut up and do your job and let me do mine! Uh, what is my job again?"

That's what a scout should say...
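The sample size point can be put in rough numbers. The sketch below is only an illustration under stated assumptions: it treats the 68 PA's as 68 at-bats, treats each at-bat as an independent trial, and assumes a true major league batting average of .270 (that figure is not from the thread). It computes the exact binomial chance of 10 or fewer hits, which is about a .147 average.

```python
from math import comb

def prob_at_most(hits, at_bats, true_avg):
    """Exact binomial probability of `hits` or fewer hits in `at_bats` tries."""
    return sum(
        comb(at_bats, k) * true_avg**k * (1 - true_avg) ** (at_bats - k)
        for k in range(hits + 1)
    )

# Assumed inputs: 68 at-bats, 10 hits (about .147), true average of .270.
p = prob_at_most(10, 68, 0.270)
print(f"P(10 or fewer hits in 68 AB | true .270 hitter) = {p:.3f}")
```

Under these assumed inputs the chance is small for any one hitter, but with many call-ups every season a few stretches like this are expected from pure luck, and the scout watching cannot tell those apart from a real problem.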

Posted 1:35 p.m., December 30, 2003 (#32) - MGL
I know of no evidence that suggests that hitters are equally disadvantaged as they play against better competition.

Neither do I, but you know as well as I do where the burden of proof lies, and you also know as well as I do that the answer is probably that there is not a huge spread among how much different hitters are truly disadvantaged as they play against better competition. And we can get some idea by looking at the variance among rookies as compared to their MLE's, can we not?

Posted 2:18 p.m., December 30, 2003 (#33) - tangotiger
Don't worry, I'll be working on this too.

Posted 2:35 p.m., December 30, 2003 (#34) - MGL
Here is one thing I would look at:

What is the average variance around a minor leaguer's one year (say min 300 PA) MLE OPS in his rookie year (rookie year, also min 300 PA)?

Compare this to the average variance around a major leaguer's previous year OPS and his next year's OPS.

Or would the year to year r tell us anything about whether and by how much a player's true talent fluctuates when he jumps from minors to majors versus from one year in the majors to another year in the majors?

The results of either of these analyses might inform us (confuse us) on not only whether there is lots of variation on how well minor leaguers "adapt" to the major leagues, but it might inform us on how good our MLE's are.

IOW, even if all minor leaguers adapted about the same (i.e., going from minors to majors was the same as going from majors to majors once you adjusted the minor league stats to make them equivalent to the majors), if the MLE was no good, we would see a larger variance around those MLE's in the next year's (in the majors) stats than in majors to majors stats, would we not?
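A minimal sketch of that comparison, assuming we already have paired seasons with at least 300 PA on each side. The OPS pairs below are made up for illustration, and the mean squared difference is just one reasonable way to measure "variance around" a prior figure:

```python
from statistics import mean

def mean_squared_diff(pairs):
    """Average squared gap between a prior OPS and the following year's OPS."""
    return mean((nxt - prior) ** 2 for prior, nxt in pairs)

# Hypothetical (prior, next-year) OPS pairs, min 300 PA in each season.
mle_to_rookie = [(0.800, 0.760), (0.750, 0.790), (0.820, 0.700)]
majors_to_majors = [(0.800, 0.780), (0.750, 0.770), (0.820, 0.790)]

print("MLE -> rookie year:", mean_squared_diff(mle_to_rookie))
print("majors -> majors:  ", mean_squared_diff(majors_to_majors))
```

If the first number comes out clearly larger on real data, that could mean hitters adapt differently, or that the MLE's are noisy, or both. The comparison alone does not separate those two explanations.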

Posted 2:58 p.m., December 30, 2003 (#35) - 3AM
I'm trying to do a study on DIPS ERA and SO rate, as MGL suggested. But I'm having trouble calculating historic DIPS ERAs. According to Voros' DIPS ERA calculations at http://www.baseballstuff.com/fraser/articles/dips2.html I need to know how many 2B and 3B teams gave up, and I can't seem to find it. Baseball Reference doesn't list pitcher 2B/3B totals, and neither does the Lahman database (that I can find, at least).

Any suggestions?

Posted 3:16 p.m., December 30, 2003 (#36) - Patriot
You don't really need to use Voros' exact procedure. Tango's DIPS estimator should work fine for that study.

Besides, Voros' second article is not his actual DIPS procedure anyway. That is in his first article and requires league D and T, not team.

Posted 3:17 p.m., December 30, 2003 (#37) - Patriot (homepage)
BTW: John Jarvis has team D and T allowed

Posted 3:49 p.m., December 30, 2003 (#38) - Enos Cabell
Not to mention the fact that it is done by "old baseball guys" whose average IQ is probably around 93 and who make anywhere from zero to $50,000 a year.

Ugh.

Posted 3:58 p.m., December 30, 2003 (#39) - tangotiger
To control for "quality" of pitcher, simply do:
FIP = (13*HR+3*BB-2*SO)/IP

If you want to ERAize it, ERA = FIP + 3.2
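As a quick sketch, the estimator above translates directly into code. The function names and the sample season are illustrative only; the coefficients and the 3.2 constant are the ones posted:

```python
def fip(hr, bb, so, ip):
    """Quick estimator as posted: (13*HR + 3*BB - 2*SO) / IP."""
    return (13 * hr + 3 * bb - 2 * so) / ip

def fip_era(hr, bb, so, ip, constant=3.2):
    """Put FIP on an ERA scale by adding the constant from the post."""
    return fip(hr, bb, so, ip) + constant

# Hypothetical season: 20 HR, 60 BB, 180 SO in 200 IP.
print(round(fip_era(20, 60, 180, 200), 2))
```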

If I were to do this study, I'd do something like:
1 - Look for all pitchers aged 25 to 28, min 3000 PAs (basically, starters)
2 - Calculate their FIP
3 - Calculate their SO/PA
4 - Calculate how many PAs they have from age 29 to 36
5 - Run a regression of 2 and 3 against 4

Posted 4:28 p.m., December 30, 2003 (#40) - 3AM
Next question: If I make a DIPS ERA+, do I need to use PF? And if so, would it look like this:

(LgDIPSERA/DIPSERA/PF)*100

Assuming PF is the number from BR.com. Would I have to divide that by 2 before applying it?

Posted 4:33 p.m., December 30, 2003 (#41) - tangotiger
I have no idea what the PF is at b-r. BUT, in your case, why bother with it. You are not trying to pinpoint the results of any one pitcher. The PF will have virtually zero impact here.

Posted 5:04 p.m., December 30, 2003 (#42) - MGL
I'd do it differently, at least at the outset, as I hate doing regressions if I don't have to, because no one knows what they mean (rhetorically speaking of course). Plus, and this is an important point for all of you "regression fans" (Tango): regressions are only necessary when you want to know about "best fit" type relationships or you want to know if correlations apply "across the board" (an r from a regression analysis is basically an average correlation among all data points). When you just want to know whether two groups of elements (in this case, high K and low K pitchers) differ in terms of another variable (longevity, future long or short-term performance), you are way better off doing something a lot simpler, more straightforward, and easier to follow and understand than a regression, at least for starters! Like with Tango's example, rather than a regression, take the same pitchers in the same age groups with the same minimum number of TBF's, and split them into 2 groups, high K and low K. The criteria for each group do not matter. You just need to strike a balance between making the 2 groups as distinct as you can while still preserving some semblance of sample size. Anyway, then just look at the number of PA's each group has from age 29 to 36, and also look at their average ERA or DIPS ERA before and after. Or first control for DIPS ERA before and then look at DIPS ERA or regular ERA after. For this last point, split each K rate group (low and high) into 2 groups: high ERA and low ERA. Make sure that both low ERA groups have around the same ERA and both high ERA groups have around the same ERA. Then look at the ERA's of these 4 groups in the "post" period. All of this is sort of like a "poor man's" regression, but it is way easier to follow.
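The four-group split described above might be sketched like this. Everything here is illustrative: the record fields are hypothetical names, the four pitchers are made up, and splitting at the median is just one choice of cutoff (the post says the exact criteria do not matter much).

```python
from statistics import mean, median

def split_groups(pitchers):
    """Split pitchers into four groups by median K rate and median ERA."""
    k_cut = median(p["k_rate"] for p in pitchers)
    era_cut = median(p["era"] for p in pitchers)
    groups = {}
    for p in pitchers:
        key = (
            "high K" if p["k_rate"] > k_cut else "low K",
            "high ERA" if p["era"] > era_cut else "low ERA",
        )
        groups.setdefault(key, []).append(p)
    return groups

def summarize(groups):
    """Average prior ERA and later PA's (ages 29-36) for each group."""
    return {
        key: (mean(p["era"] for p in members), mean(p["later_pa"] for p in members))
        for key, members in groups.items()
    }

# Hypothetical pitchers at ages 25-28: K per PA, DIPS ERA, PA's faced at 29-36.
pitchers = [
    {"k_rate": 0.22, "era": 3.5, "later_pa": 5000},
    {"k_rate": 0.21, "era": 4.6, "later_pa": 3000},
    {"k_rate": 0.12, "era": 3.6, "later_pa": 4000},
    {"k_rate": 0.11, "era": 4.7, "later_pa": 2000},
]
for key, (era, later_pa) in sorted(summarize(split_groups(pitchers)).items()):
    print(key, round(era, 2), round(later_pa))
```

Printing the average prior ERA next to each group makes it easy to check that the two low ERA groups (and the two high ERA groups) really are comparable before reading anything into the later PA totals.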

The 2 hypotheses for issue number one are that there will be a significant difference in the number of future (age 29-36) PA's between the high K group and the low K group, or there won't. OK, unless you control for ERA, like you essentially do in the multiple regression, you won't know if a difference was due to the K rate or the ERA (pitchers with better ERA's, which tend to be the high K pitchers, generally have a longer career). We want to know which one is the cause. Anyway, just break down the 2 K groups into high and low ERA groups, as I suggested for the second part of this study (looking at whether K rate affects future performance, independent of a projection based on past DIPS ERA), and do the same thing for the first part. Here is the reason I prefer this kind of an analysis (I don't know what it is called, if anything) rather than a regression:

Let's say you do the regression and you get an r between K rate and future number of PA's of .4 (or .3 or .5). What does that tell you (in English)? It tells me nothing and I assume it tells the average reader nothing unless Tango or someone who does the regression and is familiar with regressions tells you what it means! Then you just have to take their word for it!

If you do it my way, and the data says that group A, the low K rate pitchers with an average DIPS ERA of 3.8, had an average of 3200 subsequent PA's and the high K pitchers with the same average DIPS ERA had an average # of subsequent PA's of 4500, the results speak for themselves. Of course, you then have the issue of sample sizes and significance, but at least the average person, and even a sabermetrician, can relate to and understand that kind of a result a lot better than "an r of .4."

Anyway, just my tirade on why I hate regression analyses and the use of other similar rigorous scientific tools when it comes to analysing and presenting baseball issues....

Posted 8:01 p.m., December 30, 2003 (#43) - NTR Bill James
MGL is right on the lack of impact on readers of regression analysis. Even if I knew how to do them, I'd never present them as is.

Posted 10:20 p.m., December 30, 2003 (#44) - Tangotiger
I meant to do the regression analysis as a quick way. I almost always do it the "splitting the data into groups" way to present to the reader.

As for what the regression will give you, again only for the researcher's understanding, and not to explain to a reader as it would be boring, you'd get your slope and standard error for the slope. That by itself will tell you what you need to know.

Instead of using the actual K/PA rate, you could create a dummy flag as +1, 0, -1 for a K/PA rate above a certain threshold. Essentially, you are splitting the group into 3. Again here, this kind of merges the "splitting analysis" with the power of regression.
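A sketch of such a flag, assuming two cutoffs (one for "low" and one for "high"). The cutoff values here are placeholders, not figures from the thread:

```python
def k_flag(k_rate, low_cut, high_cut):
    """Return +1 for a high K/PA rate, -1 for a low one, 0 in between."""
    if k_rate >= high_cut:
        return 1
    if k_rate <= low_cut:
        return -1
    return 0

# Hypothetical cutoffs: .13 K/PA or less is "low", .19 or more is "high".
rates = [0.10, 0.16, 0.22]
print([k_flag(r, 0.13, 0.19) for r in rates])  # [-1, 0, 1]
```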

Yes, of course, statistical analysis is boring to explain, present, and read. The splitting into various control groups helps to explain to the interested reader what the boring regression analysis says.

Posted 10:33 p.m., December 30, 2003 (#45) - Rally Monkey
I've got some alternative MLE factors in case anyone is interested. Using the major and minor league EQA data from the Prospectus site, I matched players who played in both AAA and the majors. If a player had 25 PA in the majors and 390 in AAA, I prorated his EQR to 25 plate appearances. Then took the totals and figured runs per 26 outs. The minor leaguers were at 5.77 runs per game in AAA, only 3.58 in the majors, which means they were only 62% as effective.

There's a problem with selective sampling, as AAA players recalled are likely to have played over their heads, while Major leaguers demoted were likely to have sucked. Therefore I regressed 33% to the league average for both groups.

The end result is these factors for runs per game for AAA:

Int .764
PCL .711

example: Bobby Crosby AAA 7.28 RPG, Maj 5.17 (no park factors)

I repeated the process for players moving from AA to AAA, and high A to AA, then combined the factors to get an MLE factor for each league.

AA leagues:
EL .610
SL .676
TL .610

ex: Miguel Cabrera AA 9.19, Majors: 6.22

High A:
Cal .453
Car .534
Fla .568

Just in case anyone wants to see what Casey Kotchman might do in a sim jumping straight from A. I've got him around 4.3 RPG. It's not Palmeiro, but it's not far from Spiezio. Pretty good for a 20 year old.
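For anyone who wants to apply these factors, here is a minimal lookup sketch. The factors are the ones listed above; the league assigned to each example player is an assumption inferred from the posted numbers, and no park factors are applied:

```python
# Runs-per-game factors as listed in the post (league -> fraction retained).
MLE_FACTORS = {
    "Int": 0.764, "PCL": 0.711,                # AAA
    "EL": 0.610, "SL": 0.676, "TL": 0.610,     # AA
    "Cal": 0.453, "Car": 0.534, "Fla": 0.568,  # High A
}

def mle_rpg(minor_rpg, league):
    """Translate minor league runs per game to a major league equivalent."""
    return minor_rpg * MLE_FACTORS[league]

# League assignments below are assumptions, chosen to match the examples.
print(round(mle_rpg(7.28, "PCL"), 2))  # Crosby
print(round(mle_rpg(9.19, "SL"), 2))   # Cabrera
```

Both results land within a hundredth of a run of the posted 5.17 and 6.22; the small gaps are presumably rounding in the listed factors.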

Posted 12:32 a.m., December 31, 2003 (#46) - MGL
Interesting, Rally. How do your EQR ratios compare to my component ratios? I'm not sure how to convert them to the same "currency."

I get an average of -13.6 MLE lwts runs per 162 for an average minor league player (AA and AAA combined). If we figure that in the major leagues, a player creates 55 runs per 500 PA, -13.6 is 41.4, which is 75% of major league production. For AAA, the average MLE is -11.0, which is 80% of the major league level, and AA is -16.2, which is 71%.

According to Dan S., and he probably got this from BJ, a player loses around 18% going from AAA to the majors. I forgot what percentage BJ uses for AA. Clay D. says about 15% per level, which implies 15% for AAA and 28% for AA or something like that, which are close to my numbers.

I don't know if I did that right, in order to compare it to your numbers, but that sounds a lot different from what you get.
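The arithmetic behind those percentages, as a short sketch. It follows the post in treating the lwts figure and the 55-run baseline as being on the same scale, which is itself a simplifying assumption:

```python
def pct_of_mlb(lwts_runs, mlb_runs=55.0):
    """Share of average major league production, given runs above/below average."""
    return (mlb_runs + lwts_runs) / mlb_runs

for level, lwts in [("AA+AAA", -13.6), ("AAA", -11.0), ("AA", -16.2)]:
    print(level, f"{pct_of_mlb(lwts):.0%}")

# A loss of about 15% per level compounds over two levels for AA.
loss_aa = 1 - (1 - 0.15) ** 2
print(f"AA loss at 15% per level: {loss_aa:.0%}")
```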

Your observation about selective sampling being a huge problem for computing or verifying MLE's is right on the money for the exact reasons you mention. But here is the deal:

If you are using a player's actual minor league stats from one year and that player is a good player (he is from the population who normally gets called up), you don't need to worry about the selective sampling problems. You can just use the MLE's that you came up with in the same manner that you used, which is correct. If you are trying to figure out the MLE of a bad or average minor league player, i.e., a player who is NOT a normal candidate for promotion, then your original MLE coefficient or coefficients will not work correctly. Here's the important point:

Once you regress your coefficients to account for this selective sampling problem, as you did, you can't take a player's (actually ANY player's - good or bad) actual sample minor league stats and apply the regressed MLE coefficients! You have to regress the player's sample minor league stats first (towards that of an average minor league player), to reflect his true talent in the minors, and THEN apply the regressed coefficients.

If you don't do the regression on the minor league stats, that's fine if, as I said, your player is a good player (and lucky). Then you can apply your original unregressed MLE coefficients.

It is a little tricky, but trust me, I've thought about these things for years!

Here's why:

As you correctly said, MLE's gotten by just using the weighted ratios of stats (we'll use OPS) of players who played in both the minors and majors, say in the same year, are not "true" MLE's. They don't accurately reflect the true difference in talent level between majors and minors. That is where James went wrong, or at least he didn't explain it correctly. He apparently didn't realize the selective sampling issue or chose to ignore it.

All players who make it into the majors, and thus have minor stats AND major stats which we can use to come up with MLE coefficients, as a group, are both good and lucky. Let's say we have 100 players who get called up and they each get 300 PA's in the majors and they had 300 PA's in the minors. Let's forget for now about the fact that if they suck in the majors they get sent down, even if that suckiness is bad luck which it will always be, at least partially. (MGL's rule # 3,456: All above average play, on the average, is good talent plus good luck, and all below average skill is bad luck plus poor talent!)

Anyway, let's say these 100 minor league players have a group OPS of .900 (in the minors) for those 300 PA's. Their true group OPS in the minors is a lot less - probably around .800 (assuming an average OPS in the minors is .750). That is because each of these players' high OPS is based on only 300 PA's, for purposes of this experiment. In real life, for a player to get called up, he probably needs more than just that. But in any case, for all players who get called up, they got lucky, as a group, in the minors. IOW, their true OPS in the minors is a lot lower than their sample OPS for whatever time period you are using.
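A small simulation can illustrate why the called-up group's sample OPS overstates its true OPS. This is only a sketch: the talent spread, the amount of random noise in a 300 PA sample, and the call-up cutoff are all assumed values chosen for illustration, not estimates from real data.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

def simulate(n_players=2000, call_up_cut=0.880):
    """Return (sample OPS, true OPS) averages for players who 'get called up'."""
    called_up = []
    for _ in range(n_players):
        true_ops = random.gauss(0.750, 0.060)       # assumed spread of talent
        sample_ops = random.gauss(true_ops, 0.070)  # assumed luck over ~300 PA's
        if sample_ops >= call_up_cut:               # only the hot hitters move up
            called_up.append((sample_ops, true_ops))
    return mean(s for s, _ in called_up), mean(t for _, t in called_up)

sample_avg, true_avg = simulate()
print(f"called-up group: sample OPS {sample_avg:.3f}, true OPS {true_avg:.3f}")
```

Because players are picked on their sample numbers, the selected group is above average in talent but also above its own talent in luck, so the group's true OPS sits between the minor league average and its sample OPS.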

So now, these guys all get called up and they hit .750 as a group in the majors. When we calculate our MLE OPS coefficient, we are going to use .750/.900, and conclude that our players lost 17% of their OPS going from minors to majors! Wrong!

The .750 is in fact a good estimate of their true OPS in the majors. No selective sampling there because we have already chosen our group and we go forward with no attrition. Selective sampling only occurs retroactively, when you choose your group after "observing" the data already. So .750 is a fair estimate of this group of players' actual major league OPS talent.

BUT, the .900 is not! That is not a fair estimate of this group's minor league OPS talent. It is .825 (or something like that, maybe .800 - I am regressing each player's sample minor league OPS in 300 PA's towards .750 - for 300 PA's, the regression should be like 70%). So the true ratio between minor league OPS and major league OPS is NOT .750/.900, it is .750/.825, or 91%, not 83%. I am making up these numbers, but it doesn't make a difference. The important point is the distinction between our group's sample OPS and their true OPS in the minors.

So basically, you have several choices in how you compute and use your MLE coefficients. You can use the .750/.900 or 83%, but that is only going to apply to players in the minors who hit around .900 in the minors and are likely to get called up. Or you can regress those .900 players' minor league OPS's first in order to estimate their true talent in the minors and then use that number, say .825, with the .750 in the majors to get a true MLE coeff. of 91%. Now you can use this coeff. for ANY player in the minors, good or bad, but before you apply that coeff. (the 91%), you have to first establish the true OPS talent in the minors of the player in question. If the player in question has a minor OPS of .900 in 300 PA's, first do the regression and then apply the 91% coeff. If a player hits .700 in the minors in 300 PA's, first regress that .700 to maybe .735, and THEN apply the 91%.
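The second option can be sketched in a few lines using the made-up numbers from the example. The regression fractions are assumptions: 50% reproduces the .825 figure and 70% reproduces the .735 figure, and the post itself treats both as rough guesses.

```python
def regress(sample_ops, mean_ops, fraction):
    """Pull a sample OPS toward the mean by `fraction` (0 = none, 1 = all the way)."""
    return mean_ops + (1 - fraction) * (sample_ops - mean_ops)

MINOR_MEAN = 0.750  # average minor league OPS assumed in the example

# Group of call-ups: .900 sample OPS in the minors, .750 in the majors.
naive_coeff = 0.750 / 0.900                   # about 83%
true_minor = regress(0.900, MINOR_MEAN, 0.5)  # .825 with 50% regression
true_coeff = 0.750 / true_minor               # about 91%

# A weaker hitter: regress first (70% here gives .735), then translate.
weak_true = regress(0.700, MINOR_MEAN, 0.7)
print(round(naive_coeff, 3), round(true_coeff, 3), round(weak_true * true_coeff, 3))
```

The order matters: the 91% figure describes the gap between true talent levels, so it should only be multiplied by an estimate of true minor league talent, never by a raw sample OPS.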

Finally, a third way, which I think is what you did, was to first compute the .750/.900 or 83% coeff., and then to regress the coefficient towards 1. If you regressed it 50%, you would be at around 91%. Now you still have to apply that regressed number to a player's regressed minor league OPS, which is where you went wrong in your analysis with Crosby...

Posted 8:45 a.m., December 31, 2003 (#47) - Rally Monkey
If I was doing a forecast for Crosby, I'd use the regression and his 2002 totals. I never got that far with him, Cabrera, or Kotchman.

You say there's no need to regress what the players did in the majors. I would agree if I was only looking at callups where players could not be sent down, but here I'm looking at all players who played in both AAA and the majors for 1 year. This includes people like Brandon Phillips, who was (to quote you) bad and unlucky in the majors. So I think there needs to be some regression on major league stats, although I haven't looked in detail to see how many callups vs send-downs there were. Not to mention those shuttle situations, where a player gets called up and sent down many times over the year to really confuse the issue.

Posted 8:55 a.m., December 31, 2003 (#48) - Rally Monkey
I remember the 82% from the Bill James MLE's. I question that figure for two reasons:

1) I never saw his study that led to this figure, and suspect that he didn't consider selective sampling issues.

2) It was 20 years ago. Even if those numbers were right at the time, why should we assume that nothing has changed? I think there are a lot fewer really good hitters trapped in the minors, partially due to James making people pay a bit more attention to minor league stats. While Graham Koonce is good, he's no Kenny Phelps.

Posted 1:19 p.m., December 31, 2003 (#49) - MGL
I made a mistake in my last post. You HAVE to regress the player's sample minor league stats to estimate his true minor league stats, as you would in the major leagues, and THEN you would apply the MLE coefficients. The coefficients that you would use would be the ones that accounted for the selective sampling (the higher ones). That's the only way to do it. You can't use the lower MLE coefficients first and THEN regress the final MLE, as I said earlier you could do. That doesn't work. I am in the process of doing more work on MLE's. I'll get back to this thread with the results.

In the meantime, I converted all of my MLE lwts stuff to an MLE OPS+, which is simply a player's MLE OPS divided by the average OPS in the major leagues. IOW, a minor leaguer's MLE OPS+ is exactly the same as a major leaguer's OPS+.

Here are some more lists: I will also send a file of all AA and AAA players' MLE's from 2001 to 2003 to Tango and he can post them somewhere here.

2003 Best OPS+, min 300 PA

Name, 2003 age, team, pa, 2003 OPS+, 2002 OPS+, 2001 OPS+

M. Cabrera, 20, CAR, 303, 123, x, x
F. Seguignol, 28, COL, 446, 122, x, 106
B. Martin, 27, ELP, 387, 117, 76, x
B. Larson, 27, LOU, 315, 117, 124, 84
B. Jacobsen, 28, TEN, 521, 115, 99, 105
B. Myrow, 27, TRE, 591, 114, 100, x
Bu. Crosby, 27, COL, 384, 113, 79, 92
J. Bay, 25, POR, 373, 113, 100, x
J. Leone, 26, SAN, 558, 113, x, x
G. Koonce, 28, SCO, 602, 112, 106, 99
R. Ludwick, 25, OKL, 360, 112, 103, 91
T. Sledge, 26, EDM, 572, 111, 96, 91
Bo. Crosby, 23, SCO, 543, 110, 90, x

Looks like most of these players are the real McCoy. Koonce, Bay, Myrow, Jacobsen, Larson, and Seguignol are particularly impressive over the last 3 years.

Here are the best players in OPS+ over the last 3 years combined, min 800 PA, AND who have never played in the majors. Their OPS+ is also age adjusted this time, however, there is no weighting of the 3-year OPS+ averages. I take each year's sample MLE stats and adjust them to the level of a 28 yo (doesn't matter what age they are all adjusted to at this point). Then I average the 3-year stats (weighted for PA of course). That is each player's 3-year average "as if he were a 28 yo." I then "reverse adjust" those "28 yo" stats to their current age. IOW, it is just like a projection, but no year by year weighting is used. Each year in the 3-year period is given the same weight as any other year.

Interesting list, as I have never heard of half of them.

Name, 2004 age, last team (org), AA/AAA, pos, 3-yr PA, 3-yr OPS+

B. Myrow, 28, TRE (NYY), AA, 3B, 826, 113
B. Jacobsen, 29, TEN (SL), AA, 1B, 1296, 107
T. Meadows, 27, WCH (KC), AA, OF, 854, 105
J. Deardorff, 26, NBR (MIN), AA, 1B, 1239, 105
J. Gall, 26, MEM (STL), AAA, 1B, 1126, 104
T. Alvarez, 26, NVL (PIT), AAA, OF, 1126, 104
J.D. Closser, 24, TUL (COL), AA, C, 822, 101
A. Phillips, 27, COL (NYY), AAA, 2B, 801, 101
T. Sledge, 27, EDM (MON), AAA, OF, 1630, 101

Surely some of these guys belong in the majors! What happened to Phillips? He didn't play much in 2003.

Posted 6:26 p.m., December 31, 2003 (#50) - MGL
BTW, Rally, the selective sampling of only "good and lucky" players being called up will of course make it look like the "drop-off" from minors to majors is bigger than it really is. However, the other selective sampling, the fact that if a player hits poorly in the majors after being called up, he tends to be sent back down, will force the MLE coefficient back in the other direction, such that one selective sampling tends to cancel out the other, assuming that you weight the major and minor sample stats by the "lesser of the two PA's," as you correctly do. For those of you who don't understand what this means: If player A has 300 PA's with a .900 OPS in the minors and 100 PA's of an .800 OPS in the majors and player B has a .950 OPS in 200 PA's in the minors and a .750 OPS in 300 PA's in the majors, here is how we figure the MLE OPS coefficient:

Player A

He has 300 PA's in the minors and 100 PA's in the majors so we weight both his major and minor OPS by 100 (the lesser of the 2 PA's). For player B, we weight his minor and major OPS by 200, the lesser of HIS PA's. So for "both players'" average minor OPS, we have .900 times 100 plus .950 times 200, divided by 300 or .933. Their average major OPS is .800 times 100 plus .750 times 200 divided by 300, or .767 (we use the same weights for the major and minor OPS's). So the OPS MLE coefficient is .767/.933 or 82.2%. As both Rally and I explained, this is NOT the true amount that any player loses when going from minors to majors (assuming that these were real numbers and that the sample size were much larger), as the .900 and .950 do not represent these players' true minor league OPS talent whereas the .750 and .800 do represent their true major league OPS talent. It should be more like .767/.850, or something like that (the .850 being closer to the weighted average of these two players' true OPS). What Rally explained, and I explained in this post, is that in addition to that bias, there is going to be a "weighting" bias such that players who do poorly in the majors will have fewer PA's in the majors than players who do well, driving the observed coefficient back up towards 100% (the other selective sampling issue drove it down, away from 100%).
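The "lesser of the two PA's" weighting for players A and B can be written out as a short sketch (the field names are illustrative):

```python
def mle_coefficient(players):
    """Ratio of major to minor OPS, weighting each player by his smaller PA total."""
    weights = [min(p["minor_pa"], p["major_pa"]) for p in players]
    total = sum(weights)
    minor = sum(w * p["minor_ops"] for w, p in zip(weights, players)) / total
    major = sum(w * p["major_ops"] for w, p in zip(weights, players)) / total
    return minor, major, major / minor

players = [
    {"minor_pa": 300, "minor_ops": 0.900, "major_pa": 100, "major_ops": 0.800},  # A
    {"minor_pa": 200, "minor_ops": 0.950, "major_pa": 300, "major_ops": 0.750},  # B
]
minor, major, coeff = mle_coefficient(players)
print(f"minor {minor:.3f}, major {major:.3f}, coefficient {coeff:.1%}")
```

The unrounded ratio comes out near 82.1%; the 82.2% in the text is what you get from dividing the already rounded .767 by .933.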

Anyway, I'll have more info as I do more research on proper MLE's using my minor league database. As far as I can tell, the only way you can retain any semblance of linearity in applying MLE's is to first regress your sample minor league stats to convert them into a true value and THEN to apply some MLE coefficient or coefficients in order to estimate a true major league value, IF you want to be able to calculate MLE's for ALL minor league players across ALL sample sizes and all levels of talent. For example, a player with a .900 OPS in 100 PA's in the minors obviously has to have a different "true" MLE in the majors (where the definition of a true MLE is that player's true OPS [talent] if they had played in the majors or if they get called up to the majors right away) than a player who has a .900 OPS in 1000 PA's...
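One common way to make the amount of regression depend on sample size is to weight the sample by PA/(PA + k). The sketch below uses that form as an assumption; the thread does not specify a formula, and both the constant k = 300 and the .750 mean are placeholders.

```python
def true_talent(sample_ops, pa, mean_ops=0.750, k=300):
    """Shrink a sample OPS toward the mean; more PA's means less shrinkage.

    `k` is a hypothetical constant: the number of PA's at which the sample
    and the league mean get equal weight.
    """
    weight = pa / (pa + k)
    return mean_ops + weight * (sample_ops - mean_ops)

for pa in (100, 1000):
    print(pa, round(true_talent(0.900, pa), 3))
```

With any positive k, the .900 hitter with 1000 PA's keeps more of his edge over the average than the .900 hitter with 100 PA's, so the two end up with different MLE's once the same coefficient is applied.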