Individual Poster Page


Felipe Alou: Is He Afraid of the Walk?

November 13, 2002 - MGL

Tango, this is more than just a controlling-for-age situation. Since players significantly increase their walk rate throughout their careers, you definitely need to look at pre- versus with- or post-Alou, controlling for the normal walk rate progression with time. If you don't, and you study any manager or any team, it will probably look like no manager or team negatively impacts a player's or players' walk rate, simply because time has elapsed...

I like the study, but unless you use a control group or you control your study group for age or at least for "time progression" I don't think you are likely to get any meaningful results...

I agree with David that even if Alou does not impact a player's walk rate while he plays for his team and/or after he leaves, his disdain for walks may impact the team by him helping to sign and/or him giving more playing time to "aggressive players". Of course, the Alou quote belies the fact that Alou may have this "anti-walk" philosophy...

BTW, is no one concerned that Alou is obsessed with the sac bunt? He executed 65 non-pitcher bunts this year (which is around 110 attempts). The Expos also had 118 SB and 64 CS. This concerns me too. That's a lot of stolen base attempts with absolutely nothing to show for it. This is definitely not a guy that I would want managing MY team...


Felipe Alou: Is He Afraid of the Walk?

November 14, 2002 - MGL

My bad Vinay! Whoever it was ordered an awful lot of bunts! Well, at least I didn't get flamed as I would have on Clutch Hits, which I will not post on anymore, BTW...

I also agree that I don't think that too many batters will change their approach to hitting because of a particular manager or coach...


Felipe Alou: Is He Afraid of the Walk?

November 14, 2002 - MGL

David, it wasn't so much being flamed (I can take it and I bring some of it on due to my sometimes "in-your-face" style of critiquing other people's comments, of course), but the childish posts on Clutch Hits. I am not used to those kinds of "general circulation" boards I guess (I'm used to FanHome, although it has died since it changed venues - mainly because I think the new venue sucks - hard to maneuver, etc.).

Actually I can't believe the number of people who take the time to read Primer (or even know about it) and then take further time to post something that belongs in an AOL teen chat room. Also, I have little patience for uninformed, misguided persons who are not willing to learn.

If you want to see some examples, I guess look at some of the posts on the Tejada MVP thread and the Clayton versus Ordonez thread (I think). It's not that big a deal. I don't care much about the flames. It's just too frustrating and time consuming to wade through all the crap on Clutch Hits; I've got better things to do (not that I ever get to them). Thanks for some of the sympathy though guys!

Although my style is not always conducive to "acceptance" (i.e., I often annoy and piss people off) I try and add as much value as I can to these boards as well as to the field of sabermetrics in general - not because it matters one iota to the general population (it doesn't) or because it makes the game of baseball any better (it doesn't), but simply because I like it and it keeps me out of trouble (and out of class often-times - but that's another story - at least I only have one more semester to go)...


Banner Years

October 31, 2002 - MGL

Excellent work!

You lost me a little on this one (you have a great writing style which generally makes everything crystal clear). Either you deviated a little from such a style or I suddenly became thick. The latter is entirely possible. In any case, could you explain the following (as if I were a 6-yo child)?

"Let's put it altogether

Let's go back to our banner year groups. If we assume that 14% of the weight should be given to regression towards the mean, then we can use the following weights to predict next season's performance level:

24% - Year 1
24% - Year 2
38% - Year 3
14% - average

By using these weights, we can predict the Year 4 values that were produced by the last two studies. As a shorthand, you can say 1 part average, 2 parts each first two years, 3 parts third year."
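As a rough illustration of the quoted weights (a sketch only; the function name and the default league average of 100% are assumptions added here, not part of the article), the projection is just a weighted sum:

```python
# Weights quoted above: 24% Year 1, 24% Year 2, 38% Year 3, 14% league average.
# The function name and the league_average default are hypothetical.
def project_next_year(year1, year2, year3, league_average=100.0):
    weights = (0.24, 0.24, 0.38, 0.14)
    values = (year1, year2, year3, league_average)
    # Weighted sum of the three seasons plus the regression component.
    return sum(w * v for w, v in zip(weights, values))

# Three straight 149% seasons project to roughly 142% in Year 4.
print(round(project_next_year(149, 149, 149), 2))
```

With three 149% years the weighted sum comes out near 142%, which is the Year 4 figure discussed in this thread.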

(Also why the blue font? Or is that just my browser? I can't seem to highlight, in order to "copy and paste" any of the blue text!)

For those of you who question why 3 149% years followed by a 142% year is "3 lucky" years and one "more indicative of talent" year, that is exactly true! It's hard to explain why. I'll try. First of all, the 142% is exactly what we would expect after 3 years of 149%!

ANY TIME WE SAMPLE A PLAYER'S TALENT (1 YEAR, 2 YEARS, 5 YEARS) WE EXPECT THAT HIS TRUE TALENT LEVEL IS EXACTLY EQUAL TO THE SAMPLE MEAN (LWTS RATIO, OPS, OR WHATEVER) PLUS A REGRESSION TOWARDS THE MEAN OF THE POPULATION THAT PLAYER COMES FROM!

Without knowing the age, height, weight, etc., of that player, we have to assume that that player comes from a population of professional baseball players only. So the mean towards which our 3-year sample will regress is 100% (the mean normalized lwts ratio of all players, which is, by definition, 100%). And of course, the smaller the sample we have (say 2 years of 149%, as opposed to 3 years), the more we expect the next year to regress. Without doing the work, I KNOW that 2 years of 149% will be followed by something LESS than 142%.

Any given year, where we don't know a priori what the value of that year is, is expected to be neither lucky nor unlucky. That is the 142% year - simply a random (for players who already had 3 good years, of course) year whose value is unknown before we calculate it. Therefore, we expect it to be neither a lucky nor an unlucky year.

The 3 149% years ARE BY DEFINITION LUCKY YEARS! That is because we purposely chose players with 3 good years, relative to the league average. Any time we purposely choose good or bad years, we are, again, by definition, choosing "good (or bad) AND lucky (or unlucky)" years. That is why all good or bad years will regress towards more average ones!

Believe it or not, now that we have 3 years of 149% followed by 1 year of 142%, we have a "new" sample talent of around 146.7 (ignoring the 1st 149% year, as it is getting old). In fact, let's bump up the 146.7 to 147 because of the first 149% year. Even though this is our "new" talent sample for our player, it is still not our best "estimate" of his talent, i.e., our prediction for the next year! We still think that all 4 years have been lucky (remember all samples above average are automatically lucky and all samples below average are automatically unlucky; how lucky or unlucky depends upon the size of the sample; here we have 4 years of above average performance - it is still lucky performance, but not by that much) so we think that our 147% projection is TOO HIGH! Again, without looking at the database, I can tell you that the 5th year will be something less than 147% - probably around 145% (maybe less, because we may start to have an age bias - the longer the sample you look at, the more likely it is that you are looking at older players).

All that being said, although I don't think it is a problem either, I would address the age thing - either control for age or at least include it in your results.

I would also do some park adjusting or at least include some park info in your results. I am a little concerned that a banner year is weighted towards players who have changed home parks, so that, for example, a banner year followed by 3 average years will tend to be a hitter's park followed by a pitcher's park, so that in the 5th year, the player is more likely to be in a pitcher's park, whereas 3 average years followed by a banner year suggests that the player is more likely to be in a hitter's park for the 5th year. This would "screw up" the weighting system...


Banner Years

October 31, 2002 - MGL

Stephen, there is no flaw in my argument! All players (as a group, NOT every single player - heck some 149/149/149 are actually 155 players!) will regress towards the mean, because...

Some of those players are actually 150 (let's ignore the exactly 149 players) or better players and got UNLUCKY, and some of those players are less than 149 players and got lucky. Those are the only two choices. The chances that any given player is less than a 149 player and got lucky is MUCH higher than the chances that he is better than a 149 player and got unlucky, simply because there are many, many more sub-149 players.

The chances that a random 149/149/149 player is actually a sub-149 player who got lucky as compared to an above-149 player who got unlucky goes down as the sample size of the 149 group goes up (for example 149/149/149/149). However the upper limit (when the sample size is infinite) of the real talent of a sample group of players who have a sample performance of 149 is 149! It can never be higher and it can never be exactly 149. It must be lower! That is why all samples of performance above or below average will ALWAYS regress towards the mean of the population they come from. This is not my argument or opinion. It is a mathematical certainty, based on the "upper limit theorem" I describe above.
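To make the "upper limit" idea concrete, here is a minimal sketch that swaps in a textbook normal-normal shrinkage model. That model, the function name, and the spread figures (a 15-point SD of true talent, a 25-point SD of single-season luck) are all assumptions chosen for illustration, not numbers from the study:

```python
# Hypothetical normal-normal shrinkage: talent ~ Normal(population_mean, talent_sd),
# each season = talent + Normal(0, season_noise_sd). All SDs are made-up values.
def regressed_estimate(sample_mean, seasons, population_mean=100.0,
                       talent_sd=15.0, season_noise_sd=25.0):
    talent_var = talent_sd ** 2
    # Luck in the sample mean shrinks as more seasons are averaged.
    noise_var = season_noise_sd ** 2 / seasons
    weight_on_sample = talent_var / (talent_var + noise_var)
    return population_mean + weight_on_sample * (sample_mean - population_mean)

# The estimate climbs towards 149 as seasons are added but never reaches it.
for seasons in (1, 2, 3, 4, 10, 100):
    print(seasons, round(regressed_estimate(149.0, seasons), 1))
```

Under these assumptions the estimate for a 149 sample always sits between the population mean and 149, and moves closer to 149 as the number of seasons grows.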

The only caveat is the definition of the "population they come from". In Tango's study, he looked at ALL players. Any player who had a 149/149/149 period qualified. Yes, this group is comprised of mostly good players (140, 135, 155, etc.), but they still come from a population of all ML players (technically all ML players who have played full time for 3 consecutive years, which is probably an above average population, so they will not regress towards 100%, but maybe 110%). Now if we only look at first basemen or players over 6 feet tall, then the number towards which we regress will change...

David, your writing is excellent (isn't that what I said?). I just got hung up on the part I quoted in my last post. Could you re-explain that part please? As I also said, it may just be me being thick. Why did you and DS claim that I said that you didn't write the article well? That's an example of only telling part of the story and thereby distorting the truth (like politicians and commentators do all the time in order to prove a point). I know you guys didn't intentionally do that, but it is a bugaboo of mine...


Banner Years

November 1, 2002 - MGL

[i]I hope you guys can indulge a stats novice for a minute.

It's been a few years since I took true score theory, but from what I remember, outcomes are a function of true score plus measurement error. So, in other words, in some book in heaven somewhere, it may be written that Barry Bonds is a .320 hitter. Everything else represents measurement error. I think I understand regression to the mean. If the average baseball player hits .280, then we would expect Barry Bonds to follow a "true score" season with something less than .320. If I were a betting man, I would go with that.[/i]

You actually have a nice handle on what's going on! Basically any player who hits better or worse than average over any period of time is "expected" to hit closer to the mean than his sample hitting indicates. It's as simple as that. It is not conjecture. It is a mathematical certainty. It is a fundamental aspect of sampling theory. I think you completely understand how that works.

Given that we have a sample of a player's hitting (1 year, 5 years, whatever), that sample number is ALWAYS the "limit" of our "best estimate of his true talent" which is, of course, the same as his projection. For example, if Bonds' sample BA is .320, that is the "limit" of his true BA. Now the only thing left is to determine if a player's sample performance, like Bonds' .320 BA, is the upper or lower limit of his true level of performance. The way we do that is simple, once we know the mean performance of the population that our player was selected from. If that mean is less than the player's sample performance, then the sample performance is the upper limit of his true talent. If it is greater, then his sample performance is the lower limit of his true talent. In practice, it is usually easy to guess whether that mean is greater or less than a player's sample performance. In some cases, however, it is not so easy.

For Bonds, if his sample BA is .320 we are pretty sure that no matter what, the mean BA of the population that he comes from is less than that, so we estimate his true BA at something less than .320. That doesn't mean that we KNOW or that it is 100% certain that his true BA is less than .320. That's where a lot of people are making a mistake. There is a finite chance that he is a true .320 hitter, a true .330 hitter, or even a true .250 hitter who has been enormously lucky. All these things have a finite chance of being true. It's just that if you add up all the various true BA's times the chances of their occurrence, sampling theory tells you that you get a number which is closer to the population mean than his sample BA. How much closer is completely a function of how large your sample of performance is and nothing else.

The other tricky part that gets people in trouble is "What IS the population of players that a particular player comes from and what is the mean of that population?" After all, that is an important number since that is the number that we need to regress to. Finding out or estimating that number can be tricky sometimes. If we pick a player from a list of all players without regard to anything other than he has a high or low BA, or whatever we happen to be looking for, then we know that the population is ALL BATTERS. It doesn't matter that we are picking a player who has a high or low BA deliberately. There is no "selection bias" as far as the population and its mean is concerned. Remember no matter what criteria we use to choose a player, the population that that player belongs to for purposes of estimating a performance mean that we will regress to, is the group of players that we are selecting FROM, NOT the group of players that we think that our player belongs to (good hitters for example)! If we pick a .320 player from a list of all ML players (or some random sample of all NL players), then that player comes from a population of ALL players and hence the population mean that we regress to is the mean of all ML players.

Now if we find out something else about that player we chose, then all of a sudden we have a different population of players and we have to choose a different mean BA, which is not all that easy sometimes. For example, if we find out that the player is a RF'er then all of a sudden we have a player who comes from the population of ML RF'ers and NOT all ML players. Obviously the mean BA of all ML RF'ers is different than that of ALL ML players. Same thing if we find out that our player is tall or heavy or LH or RH or fast or slow, etc.

Anyway, for the umpteenth time, that's regression to the mean with regard to baseball players, in a nutshell, for whatever it's worth...

[i]So if Barry Bonds hits .320 for three years in a row, his failure to regress represents luck. But why does that mean that his true score is not .320? Why can't Barry just be a lucky player who happened to hit at his true score level for three straight years?[/i]

See the explanation above. Yes he could be a true .320 player, just like he could be a true .350 player or .280 player. It's just that the best mathematical estimate of his true BA is NOT .320, it is something less, depending upon the mean BA of his population (big, possibly steroid laden, black, LH, RF'er who has a very nice looking swing, has a great reputation, a talented father, etc.) and how many AB's the .320 sample represents...

Whew!


Banner Years

November 1, 2002 - MGL

Tango, why don't you think that a sample mean of any player you choose who has played 5 years with > 299 AB per year will not regress towards 155%? It will, if that 155% is the mean of the population of all such players. Now, in order to get (estimate) the mean of that population, you cannot weight player's numbers. For example, you cannot have more than one Aaron in your group. The population mean must come from a random, non-weighted sample of all players who have played at least 5 years with 300 or more AB per year (or whatever your criteria was). So you must find all players who fit that description and give each player equal weight even though some of those players (Aaron for example) may have many such 5-year spans.

What numbers (let's just call it BA) do you use for those players who appear more than once (have more than one 5-year span with 300 or more AB's each year)? You would take the average BA over all years for that player, as that would represent the best estimate of that player's true BA for any 5-year period...
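The equal-weight idea can be sketched as follows. The player labels and BA values are invented purely for illustration, and averaging a player's qualifying entries without weighting by AB's is a simplifying assumption:

```python
from collections import defaultdict

# Hypothetical helper: each player counts once, however many qualifying
# spans he has. Input is a list of (player, BA) pairs, one per span.
def equal_weight_population_mean(qualifying_spans):
    by_player = defaultdict(list)
    for player, ba in qualifying_spans:
        by_player[player].append(ba)
    # Collapse each player to a single average before averaging across players.
    player_means = [sum(values) / len(values) for values in by_player.values()]
    return sum(player_means) / len(player_means)

# Made-up data: player "A" qualifies three times, "B" and "C" once each.
spans = [("A", 0.300), ("A", 0.320), ("A", 0.310), ("B", 0.260), ("C", 0.270)]
naive_mean = sum(ba for _, ba in spans) / len(spans)
print(round(naive_mean, 3), round(equal_weight_population_mean(spans), 3))
```

In this made-up data the naive average over all spans is pulled upward by the player who appears three times, while the equal-weight mean counts him only once.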


Banner Years

November 1, 2002 - MGL

BTW, I contradicted myself (in a subtle way) in my second to last post. I said that in order to determine what specific population a player comes from, we look at the "list of players" that we selected our player FROM. Then I went on to say that if we found out afterwards that a player was tall, we would change our population (from ALL ML players to ALL tall ML players). This appears to be a contradiction, which it is.

What I meant was that we can use any characteristic we know about our player (either before or after we chose him) to define or estimate the population of players he comes from. We cannot, however, use his BA (or whatever it is we are trying to regress) to determine what population he comes from (for example, if it is .320, we cannot say "Oh, he must be from a population of good hitters"), because that is what we are trying to ascertain in the first place (the chances that he IS a good hitter versus the chances that he is a bad or average hitter who got lucky, etc.). It's sort of analogous to the dependent and independent variables in a regression analysis. The characteristics of the player we are regressing (like height, weight, position, etc.) are all the "independent" variables and his BA (or whatever number we are trying to regress) is the dependent variable. The "independent" variables determine the population that he is from for purposes of figuring out what number we should regress to (the mean BA of that population), while the dependent variable (the sample BA of that player) CANNOT be used to make any inferences about that population (for purposes of establishing a BA to regress to)...


Banner Years

November 2, 2002 - MGL

As far as I know, and I am no math or statistics maven (maybe slightly ahead of an ignoramus but something short of a sciolist), linear algebra is an advanced, college and graduate level, field of mathematics. So anyone who comprehends nothing more than linear algebra is indeed more advanced than I...


Banner Years

November 2, 2002 - MGL

Actually I wanted to add one more thing about "regression" as it relates to projecting talent in baseball, assuming of course, that not EVERYONE is ignoring this thread now that the cat's out of the bag (that Tango and I are ignoramuses when it comes to statistics).

While the mean of a population from which a player comes determines the upper or lower limit of his true BA (from now on, when I use BA, it is simply a convenient proxy for any metric which measures talent), it isn't that useful in terms of knowing how much to regress a player's sample BA in order to estimate his true BA. In fact, it isn't necessary at all. Nor does the size of the player's sample BA tell us how much to regress, UNLESS AND UNTIL WE KNOW OTHER CHARACTERISTICS OF THE POPULATION.

What I mean by that is that there are actually 2 things that tell us exactly how much to regress a player's sample BA to determine the best estimate of his true BA, and one of them is not the mean BA of the population to which the player belongs.

One of those things IS the size of the sample BA (1 year, 4 years, etc.). The other is the DISTRIBUTION OF THE TRUE BA'S OF ALL PLAYERS IN THE POPULATION.

Once we know those 2 things, we can use a precise mathematical formula (it isn't linear algebra, I don't think) to come up with an exact number which is the best estimate for that player's true BA.

Let's back up a little. In normal sampling statistics, a player's BA over some time period would be a sample of his own true BA and our best estimate of that player's true BA would be exactly his sample BA. So if player A had a .380 average during one month and that's all we knew about this player, regular sampling theory would say that our best estimate of his true BA was .380 and we could use the number of AB's that .380 was based on (the sample size) to determine how "confident" we were that the .380 WAS in fact his real BA, using the standard deviation of BA, which we can compute using a binomial model, etc., etc. Most of you know that.

Now here is where we sort of veer away from a normal "experiment" in sampling statistics, when it comes to baseball players and their talent. We KNOW something about the population of all baseball players, which means, both mathematically and logically, that the .380 sample BA in one month (say 100 AB's) is not necessarily a good estimate of that player's true BA. We know logically that if a player hits .380 in a month that he is NOT a .380 hitter. The only reason we know that, however, is because we know that there is either no such thing as a .380 hitter or at least that a .380 hitter is very rare. If in fact we knew nothing about the range of what ML baseball players usually hit, we would then HAVE TO say that our player was a .380 hitter (within a certain confidence interval, which would be around plus or minus 90 points to be 95% confident as the SD for BA in 100 AB's is around 45 points).
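The "around 45 points" figure can be checked with the usual binomial standard deviation, sqrt(p * (1 - p) / n). This is a minimal sketch; the function name is hypothetical, and treating every AB as an independent trial with the same hit probability is the standard simplifying assumption:

```python
from math import sqrt

# Binomial SD of a batting average over a given number of AB's, assuming
# independent AB's with a constant hit probability (an idealization).
def batting_average_sd(true_ba, at_bats):
    return sqrt(true_ba * (1 - true_ba) / at_bats)

for true_ba in (0.270, 0.380):
    sd = batting_average_sd(true_ba, 100)
    # About 1.96 SDs either way gives the usual 95% interval.
    print(true_ba, round(sd, 4), round(1.96 * sd, 4))
```

For true BA's in this range the SD over 100 AB's is between 40 and 50 points, so a 95% interval of roughly 90 points either way follows.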

So now the question, as always, is, given that our player hit .380 in 100 AB's and given that we KNOW that players rarely if ever have true BA's of .380, what IS the best estimate of our player's true BA (still within the various confidence intervals)?

Let's say the mean BA of the population of ML baseball players (for the same year as our .380 sample) is .270. According to my other posts, that is the number that we regress the .380 towards, and the number of AB's the .380 is based on (100) determines how much we regress. Well, the first part is always true (the .270 is the lower limit of our player's true BA), but the second part is only true given a certain set of characteristics of the population of baseball players. IOW, it is these characteristics that FIRST determine how much we regress the .380 towards the .270. Once we establish those characteristics, the more sample AB's we have, the less we regress.

What are those characteristics we need to determine before we can figure out how much to regress the .380 towards the .270? It is the percentage of batters in the population (ALL ML players in this case, since we know nothing about our .380 hitter other than he is a ML player) who have various true BA's. IOW, we need to know how many ML players are true .210 hitters, how many are true .230 hitters, true .320 hitters, etc. Obviously, there is a whole continuum of true BA's among ML players, but it would suffice for this kind of analysis if we estimated the number of players in each range. Now, estimating the number of players in baseball for each range of true BA's is not easy to do and is a little circuitous as well. The only way to do that is to look historically at players who have had a long career and assume that their lifetime BA is their true BA. Of course, even that lifetime BA would have to be regressed in order to get their true BA, so that's where the "circuitous logic" comes from - "in order to know how much to regress a sample BA, we have to find out the true BA's of ML players and in order to find out those true BA's we have to know how much to regress a player's lifetime BA..."

We have other problems in terms of trying to figure out how many players in ML baseball have true BA's of x. For example, not many players who have true BA's of .210 have long careers, so if we only looked at long careers to establish our percentages, we might miss some types of players (those with very low true BA's). In any case, let's assume that we can come up with a fairly good table of frequencies for the true BA's of all ML players. It might look something like <.200 (.1%), .200-.220 (1%), .220-.230 (3%),..., .300-.320 (2%), etc.

NOW we can use Bayesian (lower on the totem pole than linear algebra) probability to figure out our .380 player's true BA! The way we do that goes something like this:

What are the chances that our player is a true .200-.220 hitter (1% if we know nothing else about this hitter other than he is a ML player) GIVEN THAT he hit .380 in 100 AB's (much less than 1% of course)? What are the chances that he is a .300-.320 hitter given that he hit .380 (more than 2% of course)? Etc...

Do all the multiplication and addition (arithmetic, MUCH lower than linear algebra) and voila, we come up with an exact number (true BA) for our .380 hitter (which still has around a 90 point either way 95% confidence interval).
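Here is a rough sketch of that "multiplication and addition" in Python. The frequency table is entirely HYPOTHETICAL (five made-up bins, each represented by a single true BA), and I am assuming the chance of a given line is binomial, i.e. the chance of exactly 38 hits in 100 AB's for a hitter with that true BA. The function names are mine as well. Treat it as an illustration of the bookkeeping, not as real ML frequencies.

```python
from math import comb

def binomial_likelihood(hits, at_bats, true_ba):
    """Chance that a hitter with this true BA gets exactly `hits` hits."""
    return comb(at_bats, hits) * true_ba**hits * (1 - true_ba)**(at_bats - hits)

def posterior_over_true_ba(prior, hits, at_bats):
    """Bayes: (share of players) times (chance of the sample), rescaled to sum to 1."""
    weights = {ba: share * binomial_likelihood(hits, at_bats, ba)
               for ba, share in prior.items()}
    total = sum(weights.values())
    return {ba: w / total for ba, w in weights.items()}

# HYPOTHETICAL table: representative true BA of each bin -> share of ML players.
hypothetical_prior = {0.210: 0.02, 0.240: 0.18, 0.270: 0.50, 0.300: 0.25, 0.330: 0.05}

posterior = posterior_over_true_ba(hypothetical_prior, hits=38, at_bats=100)
prior_mean = sum(ba * share for ba, share in hypothetical_prior.items())
estimate = sum(ba * p for ba, p in posterior.items())

for ba in sorted(posterior):
    print(f"true {ba:.3f}: prior {hypothetical_prior[ba]:.1%} -> posterior {posterior[ba]:.1%}")
print(f"population mean {prior_mean:.3f}, estimated true BA {estimate:.3f}")
```

With any table shaped like this one, the estimate lands above the population mean but well short of .380, which is the whole point of the exercise.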

Remember that the mean BA doesn't tell us anything about how much to regress or what the final estimate of the true BA of our .380 hitter is; it only tells us the limit of the regression, and in fact, we don't even need to know that number, as in the calculations above. For example, let's say that the mean BA for all ML players were .270, as in the above, but that all ML players had the same true BA. The true BA for our .380 hitter or ANY hitter with any sample BA in any number of AB's would be .270. Now let's say instead that 1% of all ML players had true BA's of .380 and 99% had true BA's of .290. What would our .380 player's true BA be?

It is either .380 or .290, so it's not really a "fair" question. We could answer it in 2 ways. One would be that "there is an X percent chance that he is a .290 hitter (who got lucky in 100 AB's) and a Y percent chance that he is a .380 hitter (who hit what he was "supposed to" in 100 AB's)". The other answer is that he is a .zzz hitter, where the .zzz is X percent times .290 plus Y percent times .380, divided by 100. Here's how we would do that calculation:

The chance of a .290 hitter hitting .380 or better is .025 (.380 is 2 standard deviations above .290 for 100 AB's). The chance of a .380 hitter hitting .380 or better is .5. So if we didn't know anything about the frequencies of .290 or .380 hitters in our population, our player is 20 times more likely to be a .380 hitter than a .290 hitter (.5/.025), or he has a 95.24% chance of being a .380 hitter. But since 99% of all players are .290 hitters and only 1% are .380 hitters, we now have the chances that our player is a .380 hitter at 17%, rather than the initial 1%. So we can say that our hitter has a 17% chance of being a .380 hitter and an 83% chance of being a .290 hitter or we can say that our hitter is a .305 hitter. We get the 17% chance of our hitter being a .380 hitter by the following Bayesian formula: The ratio of the chance of being a .290 hitter who hit .380 or better (.99 times .025 or .02475) to the chance of being a .380 hitter who hit .380 or better (.01 times .5 or .005), is 4.95 to 1. That means that it is 4.95 times more likely that our .380 hitter is a true .290 hitter who got lucky, so the chances of our hitter being a .290 hitter is .8319, and hence .1681 for being a .380 hitter.

That same above Bayesian calculation would apply for any number of categories of true BA's in the population and the percentage of players in each category.
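For anyone who wants to check the two-category arithmetic, here it is as a few lines of Python. The numbers are the ones from the example (99% true .290 hitters, 1% true .380 hitters, and chances of .025 and .5 of hitting .380 or better); the function and variable names are just mine. The same function works for any number of categories, as long as each category has a population share and a chance of producing the sample.

```python
def posterior_from_chances(prior, chance_of_sample):
    """Weight each category by share * chance of the sample, then rescale to sum to 1."""
    joint = {ba: prior[ba] * chance_of_sample[ba] for ba in prior}
    total = sum(joint.values())
    return {ba: j / total for ba, j in joint.items()}

prior = {0.290: 0.99, 0.380: 0.01}                    # share of players with each true BA
chance_of_380_or_better = {0.290: 0.025, 0.380: 0.5}  # in 100 AB's, from the example

posterior = posterior_from_chances(prior, chance_of_380_or_better)
odds_lucky_290 = (prior[0.290] * chance_of_380_or_better[0.290]) / (
    prior[0.380] * chance_of_380_or_better[0.380])
expected_true_ba = sum(ba * p for ba, p in posterior.items())

print(f"odds of a lucky .290 hitter vs a true .380 hitter: {odds_lucky_290:.2f} to 1")
print(f"chance he is a true .380 hitter: {posterior[0.380]:.4f}")
print(f"chance he is a true .290 hitter: {posterior[0.290]:.4f}")
print(f"single-number estimate of his true BA: {expected_true_ba:.3f}")
```

This gives 4.95 to 1, roughly .168 and .832 for the two chances, and about .305 for the single-number estimate.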

Now, given the difficulty in determining the categories and frequencies for true BA's in the population of ML baseball players and given the cumbersome nature of the ensuing Bayesian calculations, we can forgo all of that by using a linear regression formula to approximate the same results. If we used a single regression formula for, say, the above example (a player who hits .380 in 100 AB's), we would take a bunch of data points constituting all players with a certain BA in 100 AB's (our independent variable) and regress those same players' BA for the next year, or preferably multiple years, on it. As usual, this will yield two coefficients, A and B in our y = Ax + B linear equation, and B will be close to (1 - A) times the mean BA of all baseball players (actually the mean BA in our sample group we are using in the regression analysis), so that the line pulls every sample BA towards that mean. Remember that these coefficients will only work for 100 AB's. If we want to do the same thing for a player with 500 AB's, we have to do a new regression analysis and derive a new equation, OR we can do a multiple regression analysis where number of AB's is one of the independent variables. Unfortunately, due to my status as a statistics ignoramus, I don't know whether there is a linear relationship if we include # of AB's (I don't think there is), in which case you would have to do a non-linear analysis, which is beyond my abilities...
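A minimal sketch of that single-regression idea, run on SIMULATED players since I have no real data handy. Everything numeric here is an assumption of mine: true BA's spread normally around .270 with a standard deviation of 25 points, 100 AB's in the sample year, and 1,000 later AB's standing in for "multiple years". The least-squares fit is done by hand so nothing beyond the standard library is needed.

```python
import random

def simulate_ba(true_ba, at_bats, rng):
    """Batting average over `at_bats` independent AB's for a hitter with this true BA."""
    hits = sum(rng.random() < true_ba for _ in range(at_bats))
    return hits / at_bats

def fit_line(xs, ys):
    """Ordinary least squares for y = A*x + B; returns (A, B)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

rng = random.Random(2002)
true_bas = [rng.gauss(0.270, 0.025) for _ in range(2000)]   # ASSUMED talent spread
sample_year = [simulate_ba(t, 100, rng) for t in true_bas]  # x: BA in 100 AB's
later_years = [simulate_ba(t, 1000, rng) for t in true_bas] # y: BA in later AB's

A, B = fit_line(sample_year, later_years)
predicted = A * 0.380 + B
print(f"A = {A:.3f}, B = {B:.3f}")
print(f"a .380 hitter in 100 AB's projects to about {predicted:.3f}")
```

Under these assumptions A comes out well below 1, so the .380 gets pulled most of the way back towards the population mean. Rerun it with 500 AB's in the sample year and A gets bigger, which is why the coefficients only apply to the sample size they were fitted on.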


Banner Years

November 3, 2002 - MGL

In case there is anyone on the planet still reading this thread, the 4th sentence in the third paragraph from the bottom should read (among lots of other spelling and grammar errors):

But since 99% of all players are .290 hitters and only 1% are .380 hitters, we now have the chances that our player is a .380 hitter at 17% (NOT 20%), rather than the initial 1%.


Banner Years

November 6, 2002 - MGL

Brother of Jessie,

Sorry but your argument is mathematically (statistically) unsound. I don't have the time right now to explain why.

BTW, Tango's 149 149 149 149 142 observation is not a revelation. It doesn't "need" explaining nor is it open to "criticism".

It is a mathematical certainty that no matter what the distribution of true linear weights ratios in the population of baseball players is, any player or players who show an above or below average ratio over any number of years will "regress" towards the mean (100 in this case) in any subsequent year. How much they regress (in percentage, like the 7/49, if you want to put it that way) depends entirely on how many PA's the historical data is comprised of. In this case, 4 years of 149 regressed to 142 in the 5th year. One year of 149 will regress to, I don't know, something like 120. Tango's observations were just to make sure that nothing really funny (like the statistician's view of the world is completely F'uped) was going on. We don't need to look at ANY data to tell us that 4 149's will be followed by something less than but close to the 149, or that 2 149's will be followed by something even lower and less close to 149. Again, it is a mathematical certainty, even if there is lots of learning and developing going on with baseball players. The learning and developing can only decrease the amount of regression; it cannot eliminate it! Of course, what we will and do find if we look at real-life data is that this learning and developing (to the extent that people "read into" these banner years) is small or non-existent beyond the normal or average age progression. The reason we know this is that if the learning and developing were a large or even moderate factor, we would see much smaller regressions after banner years than we do. The regressions we do see comport very nicely with what a statistical model would predict if no learning and developing were going on. Given that, there can be only one conclusion - THAT A BANNER YEAR IS 99 PARTS FLUCTUATION AND 1 PART LEARNING AND DEVELOPMENT (i.e., the concept of a "breakout year" is a FICTION!)
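If you want to see this without touching real data, here is a toy simulation. All of the numbers are ASSUMPTIONS of mine, not estimates from baseball: true lwts ratios spread normally around 100 with a standard deviation of 15, each season is the true ratio plus random luck with a standard deviation of 20, and there is NO learning, developing or aging at all. We keep only the players who beat 125 in every one of n straight seasons and then look at their next season.

```python
import random
from statistics import mean

LEAGUE_MEAN = 100.0  # lwts ratio of an average player, by definition
TALENT_SD = 15.0     # ASSUMED spread of true ratios among players
LUCK_SD = 20.0       # ASSUMED random fluctuation in a single season

def banner_run(n_years, threshold=125.0, n_players=100_000, seed=7):
    """Return (mean ratio during the run, mean ratio the next year, players kept)."""
    rng = random.Random(seed)
    run_means, next_years = [], []
    for _ in range(n_players):
        true_ratio = rng.gauss(LEAGUE_MEAN, TALENT_SD)
        seasons = [true_ratio + rng.gauss(0.0, LUCK_SD) for _ in range(n_years + 1)]
        if all(s > threshold for s in seasons[:n_years]):
            run_means.append(mean(seasons[:n_years]))
            next_years.append(seasons[n_years])
    return mean(run_means), mean(next_years), len(next_years)

results = {n: banner_run(n) for n in (1, 2, 4)}
for n, (run, nxt, kept) in results.items():
    share_lost = (run - nxt) / (run - LEAGUE_MEAN)
    print(f"{n} banner year(s): {kept} players, run {run:.1f}, next year {nxt:.1f}, "
          f"regressed {share_lost:.0%} of the way to {LEAGUE_MEAN:.0f}")
```

Even with zero learning and developing in the model, every group drops back towards 100 the following year, and the longer the run of big years, the smaller the share of the gap that disappears.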


Banner Years

November 7, 2002 - MGL

Mr. James,

You need to read my protracted discussion on "regression" as it applies to baseball talent. I think it is in this thread, but I'm not sure.

Despite your moniker, you got no shot to "shoot me down" on this one!

A player's stats get regressed to the mean of the population that he comes from. Yes, if we assemble ALL-STARS and choose players from that group, the mean is greater than 100. Same is true if we assemble right fielders and choose a player or players from that group. If we assemble a group of sub 6-foot players, our mean will probably be less than 100.

Tango looked at all ML players and chose those who had high lwts ratios (an average of 149) for 4 straight years. EVEN THOUGH THESE ARE OBVIOUSLY GREAT PLAYERS, THEY CAME FROM A GROUP OF ALL ML PLAYERS. That is why you regress toward the mean of all ML players (actually you regress toward the mean of all ML players who have played at least 5 years). If you assemble All Stars and choose from that group, THERE HAS TO BE AN INDEPENDENT REASON FOR YOU CALLING THEM ALL-STARS, OTHER THAN THE CRITERIA YOU USED TO SELECT PLAYERS FROM THAT GROUP. IOW, in order to regress those same 149 players to a mean greater than 100, you would need to assemble your ALL STARS first by using some criteria independent of having a high lwts ratio for 4 straight years - say a high lwts ratio for the previous 3 years. If you do that, then you regress toward the mean of all players who have had high lwts ratios for 3 straight years and have played for at least 5 more years. Get it!

You regress towards the mean of the population that your player comes from! You cannot make any inferences about that population based upon your sampling criteria! That's the whole point of regression!

Read this - it is important:

To put it another way, in Tango's example, he looks at the entire population of baseball players. They include all players of all true talent. They have a certain mean lwts ratio, which we can easily measure, and of course, we define it as 100. Next he ignores those players who have not had a minimum number of PA's for at least 5 years, right? So now we have a population of players who have had a min # of PA's for 5 straight years. We take the mean of that population, which is probably higher than 100 (say 105). That is the number we regress to! The fact that we now select only those players who have had at least a 125% lwts ratio for 4 straight years DOES NOT CHANGE THE POPULATION THAT THESE PLAYERS WERE SELECTED FROM. That is the population whose mean we regress to! Yes, that group of > 125% players are ALL-STARS as a group. Their true lwts ratio is much greater than 105, but it is NOT 149, as we can see from the 5th year ratio of 142. By definition, when we regress to the mean (this is not my "opinion", it is a rule of statistics), we regress to the mean of the population from which we chose our players, regardless of what criteria we used to select those players. By criteria, I mean "What range of lwts ratios?", like the > 125% that Tango chose. If we choose criteria (these are independent variables) like what position, or what weight, OR WHAT WAS THEIR LWTS RATIO IN THE PRIOR YEAR OR YEARS, then we have a new population and hence, a new mean to regress to. Anyway, I got off on a tangent as far as the important thing to read...

When Tango chose those players who had ratios above 125% for 4 straight years, the reason we regress at all is that those players selected consist of: 1) players with a true ratio of less than 149 who got lucky, 2) those players with a true ratio around 149, the sample mean, and 3) those players who have a true ratio GREATER than 149. We don't know in what proportion they exist, but even though it is more likely that a player who has a sample mean of 149 is a true 149 player, and it is less likely that he is a less than or greater than 149 true player, there are many, many more players in our original group (our population) that were true sub-149 players, so it is much more likely that an "average" player in our 149 sample group of players is a true sub-149 player who got lucky. It just so happens that the proper mean to regress to is the mean of the original group, whatever that is (105?).

If we had chosen a group of ALL-STARS, based on, say, 3 years worth of lwts ratios above 125%, we now have a group whose true lwts ratio is around 135 or so. Now, if FROM THAT GROUP, we select those players who have had 4 MORE years of > 125%, then we have the same experiment as Tango's, except that that group is ALREADY made up of players who have a true ratio of around 135, as opposed to Tango's example, where the group he selected from is ALL ML players who have played for 5 years (etc.). They only have a mean ratio of around 105. So in the second experiment, where we choose from a group of KNOWN good players (on the average, not all of them), many more of the players we select are good players who did not get lucky or got only a little lucky for the next 4 years (after the initial 3 years of > 125% performance). Many more (percentage-wise) in the second group, as opposed to the first group, are also true > 149 players who got unlucky. That's why the true ratio in the second group is more than 142 (probably 145 or so). It is still not 149, since the mean of the group of ALL-STARS is only around 135, so we still have to regress the 149 sample mean towards 135. The "reason" for this is that we still have some lingering players who are not very good, but managed to make it into the ALL-STAR group through luck, and ALSO managed to make it into the > 125% for the next 4 years group. Obviously not many players will make it through these 2 hurdles, but as long as there is a finite chance that any true sub-149 player will make it, the true ratio of the 149 group will ALWAYS be less than 149! You may say, wait, there is an equal chance that a > 149 player made it through both groups as a sub-149 player, so they would cancel each other out! In fact that is true! There is an equal likelihood that any player in our 149 group is a true 154 or a true 144 (each one is 5 points different from the mean).
But here is the kicker that makes us regress downward in either experiment: There are many more players in either population who are true sub-149 players than there are who are > 149 players, so an average player in our 149 group is MORE likely to be a 144 player who got lucky than a 154 player who got unlucky, simply because there are more 144 players!

Now if we chose our ALL-STARS such that our estimate of the average true ratio in that group of ALL-STARS were 155 (let's say we selected all players who had a ratio for 3 years of greater than 140 - not too many players, of course), then if we did the same experiment, and the sample ratio of players who had > 125% for the next 4 years was still 149, we would actually regress upwards such that our estimate of the 149 group's true mean ratio would be like 153 or so!
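Here is a toy version of the two situations, to show that the direction we regress depends only on the mean of the group the players were drawn from. The setup is a simplification and every number in it is an ASSUMPTION: true ratios are spread normally (standard deviation 15) around the source group's mean, a 4-year sample average is the true ratio plus luck (standard deviation 10), and instead of the "> 125% in each year" rule I simply keep players whose 4-year average came out within 2 points of 149.

```python
import random
from statistics import mean

def true_mean_of_149_group(source_mean, n_players=200_000, seed=11):
    """Average TRUE ratio of players whose observed 4-year average is about 149."""
    rng = random.Random(seed)
    selected_true = []
    for _ in range(n_players):
        true_ratio = rng.gauss(source_mean, 15.0)        # ASSUMED talent spread
        observed = true_ratio + rng.gauss(0.0, 10.0)     # ASSUMED luck in a 4-year average
        if 147.0 <= observed <= 151.0:
            selected_true.append(true_ratio)
    return mean(selected_true), len(selected_true)

from_all_players, n_low = true_mean_of_149_group(source_mean=105.0)
from_elite_group, n_high = true_mean_of_149_group(source_mean=155.0)

print(f"drawn from a group averaging 105: true mean {from_all_players:.1f} ({n_low} players)")
print(f"drawn from a group averaging 155: true mean {from_elite_group:.1f} ({n_high} players)")
```

Same 149 sample in both cases, but the group drawn from the 105 population has a true mean below 149, and the group drawn from the 155 population has a true mean above it.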

I hope this explains everything, because I just missed my appointment at the gym!


Banner Years

November 7, 2002 - MGL

I'm done trying to explain how it works with baseball talent (or any similar experiment). Either we are misunderstanding one another or you are very stubborn or both. Maybe someone else can explain it to you or maybe we can just drop it.

If a sample of players (yes, NOT a random sample), using a measurement criterion (like above 125% for 4 straight years), drawn from the population of baseball players DID NOT regress toward the mean, then you would NOT see the 5th year at 142 - you would see it at 149, right?

Do an experiment which should make everything obvious - in fact, you don't even have to do the experiment - the results should be obvious:

Look at all BB players from 2001. Take only those who had an OPS above .900 (non-random sample - obviously). Their average (mean) OPS is something like .980. What do you think their OPS as a group is in 2002? It is the .980 regressed toward the mean of the entire population of baseball players (around .770), which will be maybe .850 or .880. The 2002 OPS is also the best estimate of these players' true OPS (given a large enough sample of players, the best estimate of those players' average true OPS is the next year's - or any large random sample - OPS). We KNOW this, right? We know that if we take any non-random sample of players defined by a certain performance (less than .700 OPS, greater than .800, etc.), their subsequent (or past) OPS will REGRESS (toward the mean of the whole population of BB players)! That is regression towards the mean (for an experiment like this)! There does not have to be random sampling, although the exact same thing would happen if we took a random sample!

What do you think would "happen" (what would the future look like) if we looked at all those players who had an OPS of over 1.000 for one week? (See my thread on hot and cold streaks.) Would their future (or past) OPS regress towards the mean or wouldn't it? Would their average OPS (of around 1.100) remain the same? Do you not understand what I am getting at here? What do you think the true talent (OPS) of these one-week 1.100 guys is? I know you don't think it is 1.100, which would mean that they will continue at a 1.100 clip. I know you know that they will continue at a pace closer to the league average (probably around .880). What do you call that other than regression to the mean?

(Light bulb went off in my head!) Now I see what you are saying! My apologies. Yes, technically, these "higher than average groups" (the 149 for 4 straight years guys or the better than 1.000 OPS for one week guys) will regress toward THEIR mean lwts ratio or OPS and NOT the mean of the league as a whole. Yes, that is true and I think that is what you are trying to say. Again, my apologies. You are absolutely correct. IN PRACTICE, you can use the mean of the whole league to regress to, because you don't know what the mean of the group you selected is - in fact, that is what you are trying to figure out. IOW, if we take the 149 ratio guys and want to figure out their true ratio or what their ratio will be in the next year (same thing, basically), then technically, we must use their true ratio to regress to, but that's what we are trying to figure out in the first place - what that true ratio is. Since we don't know that and the only thing we know is the mean ratio of all players, we have to regress that 149 towards that all-player mean of 105 or whatever it is. Yes, technically that 149 doesn't get regressed towards 105. It gets regressed towards something less than 149 and more than 105, but since we don't know what that is, it LOOKS like it gets regressed towards the 105.

Anyway, there is no argument anymore, unless you think that the true ratio of that 149 group is 149, rather than something like 142 (the 149 regressed towards the true ratio). If you do, you must wonder why the next year comes out to 142. If you do, you must think that one year of a more than .900 OPS is a player's true OPS, again, in which case you must wonder why these guys show a .800 OPS or so the next year. And if you do, you must certainly wonder why a group that shows a 1.100 in a week does not continue at that clip, even though we did not randomly select this group of players (we only looked at players who had greater than a 1.000 OPS in a certain week)...

Cheers!


Banner Years

November 8, 2002 - MGL

I'm gonna try one more time! Forget my last post!

Forget about the expression "regression to the mean". It is a generic expression which could have different meanings depending upon the context. Pretend it doesn't exist.

Remember that when I say that a value (call it value 1) gets "regressed" to another value (call it value 2), THAT MEANS two things and two things only:

1) Value 2 represents the direction in which we move value 1 (it can be moved either up or down).

2) If we don't know how much to move value 1 (which we usually don't in these types of experiments), value 2 represents the limit of the movement.

For example, if value 1 is 149 and value 2 is 135, we know 2 and only 2 things, assuming that we have to "move" value 1. One, we move it down (towards the 135), and two, we move it a certain unknown amount but we never move it past 135.

How does this very vague above concept apply to baseball players? I'm glad you asked.

First, I am going to call "value 2", as described above, "the mean" and I am going to substitute the word "regress" for the word "move" as used in the context above. This is literally an arbitrary choice of words. We might as well say "wfogbnlnfl to the slkvdn". I'm using the word "mean", not in any mathematical sense, but to represent the "limit of how much we move value 1". Likewise, I am using the word "regress", also not in any mathematical sense, but purely as a substitute for the word "move".

So "regression to the mean" from now on simply means "We move value 1 either up or down, depending on whether value 2 is less than or greater than value 1, and value 2 also represents the most we can move value 1."
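That definition is small enough to write down as a function. This is only a sketch of the definition itself: `fraction` is a name I am making up for "how much we move", which in real experiments is usually the unknown part. All the definition guarantees is the direction (towards value 2) and the limit (never past value 2).

```python
def regress(value_1, value_2, fraction):
    """Move value_1 towards value_2 by `fraction` of the gap (0 = no move, 1 = all the way)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1: we never move past value 2")
    return value_1 + fraction * (value_2 - value_1)

print(regress(149, 135, 0.0))  # no movement: stays at 149
print(regress(149, 135, 0.5))  # halfway down: 142.0
print(regress(149, 135, 1.0))  # the limit: 135.0
print(regress(149, 155, 0.5))  # value 2 above value 1, so we move UP: 152.0
```

Whatever the fraction turns out to be, the result always lands somewhere between value 1 and value 2, which is all "regression to the mean" is being used to mean here.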

Now here are some experiments in which I will attempt to apply the above methodology. You tell me whether it should be applied or not (and if not, why). If you think that it is appropriate to apply, you must also tell me what value we should use for value 2. The correct answers will appear at the end of this post.

Experiment #1:

Let's say that we have a population of BB players and we don't know whether all players in that population have the same true OPS or not. Either everyone has the same true OPS (like a population of all different denominations of coins, where every coin has the same true H/T "flipping" ratio), or some or all of the players have different true OPS's (like if we had a population of coins and some coins were 2-sided, while others were 3-sided or 4-sided, etc.).

Now let's say that we randomly sample 100 of these players and look at a 1-year sample of each player's OPS. We basically have 100 "man-years" of data that were randomly sampled (not exactly, but close enough) from a population of all baseball players.

Let's say that the mean OPS of these 100 players is .780. This is our value 1, by the way. Let's also say that WE DO NOT KNOW WHAT THE MEAN OPS OF THE POPULATION OF ALL PLAYERS IS. Remember that we randomly sampled these 100 players and 1 year's worth of data for each player from a population of all players and all years.

What is the best estimate of the true OPS of these 100 players, given that the average OPS for all 100 players for 1 year each is .780?

In order to arrive at that estimate, did you need to determine a value 2 and did you need to move value 1 (.780) in the direction of value 2 and does value 2 represent the furthest you should move value 1? If the answer is yes to all 3 related questions, how did you arrive at value 2? If the answer is yes to some and no to some (of the above 3 questions), please explain.

Experiment #2:

Same as experiment #1, but we now know (G-d tells us) that the mean OPS of the population of all players is .750 AND we know that all players have the same true OPS.

Again, what is the best estimate of the true OPS of our 100 players chosen randomly (and their 1-year stats each)? This is a no-brainer right? It is not a trick question. The answer is as obvious as it seems.

Given your answer to the above question, did you move value 1 (the .780), is there a value 2 (and if so, what is it), and if the answer to both questions is yes, do we know exactly how much to move value 1 ("towards" value 2)? IOW, is there regression to the mean (remember my above definition - movement, direction, and limit, where "regression to" is "movement towards" and "the mean" is "value 2")?

Experiment #3: (We are getting closer and closer to Tango's experiment)

Same as above (#2) only this time not only do we know that the mean OPS of all players is .750, we also know (again from G-d, not from any empirical data) that all players in the population have different true OPS's. IOW, some players have true OPS's of .600, some .720, some .850, some .980, etc. In this experiment we don't know, nor does it matter, what percentage of players have true OPS's of x, what percentage have true OPS's of y, etc. We only know that different players have different true OPS's. So in our random sample of 100 players, each player could have a true OPS of anything, assuming that every OPS is represented in the population in unknown proportions.

Now, remember, like the previous experiments, the mean OPS of our 100 randomly selected players for 1 year each is .780 (at this point it doesn't matter that we used 1-year data for each player; we could have used 2 years or 6 months). Remember also that we KNOW the true average (mean) OPS of all the players is .750. And don't forget (this is what makes this experiment different from #2) that we KNOW that different players in the population have different true OPS's, of unknown values and in unknown proportions (again, the last part - the "unknown values and in unknown proportions" - doesn't matter yet).

So now what is your best estimate of the true average (mean) OPS of the 100 players? Is this an exact number? Do we use a "regression to the mean"? If yes, what is value 2 and do we know exactly how much to move (regress) value 1 (again, the .780) towards value 2?

Here are the answers to the questions in experiments 1-3:

Experiment #1 (answer):

The best estimate of the average true OPS of the 100 players is .780, the same as their sample OPS. There is no "regression to the mean". There is no value 2; therefore there is no movement from value 1. (Technically, we could say that value 2 is .780 also, the same as value 1, and that we regress value 1 "all the way" towards value 2.) The above comes from the following rule in sampling statistics:

When we sample a population (look at the 1-year OPS of 100 players) and we know nothing about the characteristics of the population, as far as the variable we are sampling (OPS) is concerned, the sample mean (.780) is the best estimate of the population mean.

Experiment #2 (answer):

The answer is that no matter what the sample OPS is (in this case .780), the true OPS of any and all players (including the average of our 100 players) is .750! This is simply because we are told that the true OPS of all players is .750! Any sample that yields an OPS of anything other than .750 MUST BE DUE TO SAMPLING ERROR


Banner Years

November 8, 2002 - MGL

I'm gonna try one more time! Forget my last post!

Forget about the expression "regression to the mean". It is a generic expression which could have different meanings depending upon the context. Pretend it doesn't exist.

Remember that when I say that a value (call it value 1) gets "regressed" to another value (call it value 2), THAT MEANS two things and two things only:

1) Value 2 represents the direction in which we move value 1 (it can be moved either up or down).

2) If we don't know how much to move value 1 (which we usually don't in these types of experiments), value 2 represents the limit of the movement.

For example, if value 1 is 149 and value 2 is 135, we know 2 and only 2 things, assuming that we have to "move" value 1. One, we move it down (towards the 135), and two, we move it a certain unknown amount but we never move it past 135.

How does this very vague above concept apply to baseball players? I'm glad you asked.

First, I am going to call "value 2", as described above, "the mean" and I am going to substitute the word "regress" for the word "move" as used in the context above. This is literally an arbitrary choice of words. We might as well say "wfogbnlnfl to the slkvdn". I'm using the word "mean", not in any mathematical sense, but to represent the "limit of how much we move value 1". Likewise, I am using the word "regress", also not in any mathematical sense, but purely as a substitute for the word "move".

So "regression to the mean" from now on simply means "We move value 1 either up or down, depending on whether value 2 is less than or greater than value 1, and value 2 also represents the most we can move value 1."
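As a minimal sketch of this definition (the function name `regress` and the `fraction` parameter are my own illustration, not part of the post), the movement can be written as a step from value 1 toward value 2. A fraction of 0 means no movement and a fraction of 1 means moving all the way to the limit:

```python
def regress(value1, value2, fraction):
    """Move value1 toward value2 by `fraction` of the gap between them.

    `fraction` is a hypothetical knob: 0.0 leaves value1 alone and 1.0 moves
    it all the way to value2. It is clamped to [0, 1] so that the result can
    never pass value2, which is the "limit" part of the definition.
    """
    fraction = min(max(fraction, 0.0), 1.0)
    return value1 + fraction * (value2 - value1)


# Value 1 is 149 and value 2 is 135: any movement is down, never past 135.
for f in (0.0, 0.5, 1.0):
    print(f, regress(149, 135, f))
```

The direction falls out of the sign of `value2 - value1`, so the same function moves value 1 up when value 2 is the larger number.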

Now here are some experiments in which I will attempt to apply the above methodology. You tell me whether it should be applied or not (and if not, why). If you think that it is appropriate to apply, you must also tell me what value we should use for value 2. The correct answers will appear at the end of this post.

Experiment #1:

Let's say that we have a population of BB players and we don't know whether all players in that population have the same true OPS or not. Either everyone has the same true OPS (like a population of all different denominations of coins, where every coin has the same true H/T "flipping" ratio), or some or all of the players have different true OPS's (like if we had a population of coins and some coins were 2-sided, while others were 3-sided or 4-sided, etc.).

Now let's say that we randomly sample 100 of these players and look at a 1-year sample of each player's lwts ratio. We basically have 100 "man-years" of data that were randomly sampled (not exactly, but close enough) from a population of all baseball players.

Let's say that the mean OPS of these 100 players is .780. This is our value 1, by the way. Let's also say that WE DO NOT KNOW WHAT THE MEAN OPS OF THE POPULATION OF ALL PLAYERS IS. Remember that we randomly sampled these 100 players and 1 year's worth of data for each player from a population of all players and all years.

What is the best estimate of the true OPS of these 100 players, given that the average OPS for all 100 players for 1 year each is .780?

In order to arrive at that estimate, did you need to determine a value 2 and did you need to move value 1 (.780) in the direction of value 2 and does value 2 represent the furthest you should move value 1? If the answer is yes to all 3 related questions, how did you arrive at value 2? If the answer is yes to some and no to some (of the above 3 questions), please explain.

Experiment #2:

Same as experiment #1, but we now know (G-d tells us) that the mean OPS of the population of all players is .750 AND we know that all players have the same true OPS.

Again, what is the best estimate of the true OPS of our 100 players chosen randomly (and their 1-year stats each)? This is a no-brainer right? It is not a trick question. The answer is as obvious as it seems.

Given your answer to the above question, did you move value 1 (the .780), is there a value 2 (and if so, what is it), and if the answer to both questions is yes, do we know exactly how much to move value 1 ("towards" value 2)? IOW, is there regression to the mean (remember my above definition - movement, direction, and limit, where "regression to" is "movement towards" and "the mean" is "value 2")?

Experiment #3: (We are getting closer and closer to Tango's experiment)

Same as above (#2) only this time not only do we know that the mean OPS of all players is .750, we also know (again from G-d, not from any empirical data) that all players in the population have different true OPS's. IOW, some players have true OPS' of .600, some .720, some .850, some .980, etc. In this experiment we don't know, nor does it matter, what percentage of players have true OPS's of x, what percentage have true OPS's of y, etc. We only know that different players have different true OPS's. So in our random sample of 100 players, each player could have a true OPS of anything, assuming that every OPS is represented in the population in unknown proportions.

Now, remember, like the previous experiments, the mean OPS of our 100 randomly selected players for 1-year each (at this point it doesn't matter that we used 1-year data for each player. We could have used 2-year or 6-months), is .780. Remember also that we KNOW the true average (mean) OPS of all the players is .750. And don't forget (this is what makes this experiment different from #2) that we KNOW that different players in the population have different true OPS's, of unknown values and in unknown proportions (again, the last part - the "unknown values and in unknown proportions" - doesn't matter yet).

So now what is your best estimate of the true average (mean) OPS of the 100 players? Is this an exact number? Do we use a "regression to the mean"? If yes, what is value 2 and do we know exactly how much to move (regress) value 1 (again, the .780) towards value 2?

Here are the answers to the questions in experiments 1-3:

Experiment #1 (answer):

The best estimate of the average true OPS of the 100 players is .780, the same as their sample OPS. There is no "regression to the mean". There is no value 2; therefore there is no movement from value 1. (Technically, we could say that value 2 is .780 also, the same as value 1, and that we regress value 1 "all the way" towards value 2.) The above comes from the following rule in sampling statistics:

When we sample a population (look at the 1-year OPS of 100 players) and we know nothing about the characteristics of the population, as far as the variable we are sampling (OPS) is concerned, the sample mean (.780) is the best estimate of the population mean.

Experiment #2 (answer):

The answer is that no matter what the sample OPS is (in this case .780), the true OPS of any and all players (including the average of our 100 players) is .750! This is simply because we are told that the true OPS of all players is .750! Any sample that yields an OPS of anything other than .750 MUST BE DUE TO SAMPLING ERROR, by definition. I told you this one was a no-brainer! In this experiment, there is "regression to the mean" (again, per my definition - re-read it if you forgot what it was). Value 1 (.780) gets moved towards value 2 (.750). It just so happens that we know exactly how much to move it (all the way). Value 2 is still the limit on how much we can move value 1 in order to estimate the average true OPS of the sample group of players. And in this case, value 2 is equal to THE MEAN OF THE POPULATION! How do you like that? In this experiment, regression to the "mean" is really to the "MEAN"!

Experiment #3 (answer):

Remember we still know the population mean (.750). This time, however, not only are we not told that all players have the same true OPS, we are told that they definitely don't. The answer is that the best estimate of the true OPS of our 100 player sample (with a sample average OPS of .780) is something less than .780 and something more than .750. We don't know the value of the "somethings" so there is no exact answer other than the above (given the information we have). So again we have "regression to the mean", with value 2 still being .750, value 1 still .780, and the amount of regression or movement is unknown. The movement must be down since value 2 is less than value 1, and the limit of the movement is .750 since that is the value of value 2. (We can actually estimate the amount of the movement given some other parameters but that is not important right now - we are only interested in whether "regression to the mean" is appropriate in each of these experiments, and if it is, what is the value of value 2.) BTW, as in experiment #2, value 2 happens to be the population mean, so the expression "regression to the mean" is somewhat literal, although again, that is somewhat of a coincidence.
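To illustrate how "some other parameters" could pin down the amount of movement, here is a rough sketch. It assumes (my assumption, not the post's) that true OPS and one-year sampling error are independent and roughly normal, in which case the standard shrinkage result keeps a share of the gap equal to talent variance divided by total variance. The spreads `talent_sd` and `noise_sd` are invented for illustration; the post gives no values for them.

```python
def estimate_true_mean(sample_mean, pop_mean, talent_sd, noise_sd):
    """Regress sample_mean toward pop_mean.

    talent_sd: assumed spread (SD) of true OPS across players.
    noise_sd:  assumed spread (SD) of one-year sampling error.
    Both are hypothetical inputs chosen only to show the mechanics.
    """
    talent_var = talent_sd ** 2
    noise_var = noise_sd ** 2
    # Share of the gap between sample and population mean that is kept.
    keep = talent_var / (talent_var + noise_var)
    return pop_mean + keep * (sample_mean - pop_mean)


# Experiment #3 with made-up spreads: the estimate falls between .750 and .780.
print(round(estimate_true_mean(0.780, 0.750, talent_sd=0.060, noise_sd=0.040), 3))
# Experiment #2 is the special case where all players share one true OPS.
print(round(estimate_true_mean(0.780, 0.750, talent_sd=0.0, noise_sd=0.040), 3))
```

Whatever spreads are plugged in, the result stays between value 1 and value 2, which is all the post claims at this stage.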

Back to some more experiments (leading up to Tango's)...

Experiment #4:

We know the average OPS of all players is .750 and we know that all players have the same true OPS. This time, however, we select X players, not randomly, but all those who had an OPS in 1999 greater than .850. Let's say that there were 25 such players and that their average (sample) OPS was .912 (for that 1 year).

What is the average true OPS of our 25 players? Again, easy (trick) answer! It is still .750, since you are told that all players have a true OPS of .750. Again it doesn't matter what criteria we choose to select players or what any player's 1-year sample OPS is. All sample OPS's that differ from .750 are due to statistical fluctuation (sample error), by definition. Again, "regression toward the mean", where the "regression" is 100% and the "mean" (value 2) is the mean OPS of the population. So we have "regression to the mean" even though we did not choose a random sample of players from the population. We chose them based upon the criteria we set - greater than an .850 sample OPS in 1999.

Experiment #5 (same as Tango's):

Same population of players. It has a mean OPS of .750. Unlike the above experiment, each player can have a different true OPS. This time we only look at players who had a sample OPS of greater than .990 during the month of June in 2000. The average (sample) OPS of this group (say 50 players) is 1.120.

What is your best estimate of the true OPS of this group of players? (This question is of course exactly the same question as "What is your best estimate of what this group of players will hit in July 2000 or August 2000, not counting things like changes in weather, etc.?") Well what is it, and how do you arrive at your answer? Is your answer reasonable?

In order to arrive at your answer, was there any "movement" from the 1.120, like there was in the last experiment (the .912 had to be moved toward the .750 - in fact all the way to it)? In which direction - up or down? Why? If you did move the 1.120 to arrive at a different value for the "best estimate of the sample players' true OPS" how much did you move it? How much should you move it? Is there a limit on how much you should move it? If you did move it, is there a value 2 that tells us in what direction to move value 1 (the 1.120)? How did you arrive at the value 2? Does this value 2 (like all value 2's are supposed to do) represent the limit of the movement? If not, why not, and what is the limit of the movement? Was there "regression to the mean" in arriving at your estimate of the sample players' average true OPS? If yes, what value represented "the mean"?

I'm not going to answer any of the above questions in the last experiment. If you answer them (and the others) correctly, you will know everything there is to know about Tango's and similar experiments and whether there is or is not "regression to the mean", in the generic sense, in determining sample players' OPS, no matter how we choose our sample (randomly or not), and what value is represented by "the mean", given that "regression" simply means "some amount of movement"...
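For readers who want to check their answers to Experiment #5 numerically, a small simulation is one way to do it. This is only a sketch under invented assumptions: true OPS is drawn from a normal distribution around .750, one month of play adds normal noise, and the spreads (`talent_sd`, `noise_sd`) are made-up numbers rather than anything measured.

```python
import random


def simulate(num_players=50_000, pop_mean=0.750, talent_sd=0.060,
             noise_sd=0.150, cutoff=0.990, seed=1):
    """Select players on one month of sample OPS and compare the group's
    sample average with its true average. All spreads are hypothetical."""
    rng = random.Random(seed)
    picked_sample, picked_true = [], []
    for _ in range(num_players):
        true_ops = rng.gauss(pop_mean, talent_sd)    # the player's true OPS
        sample_ops = rng.gauss(true_ops, noise_sd)   # one noisy month
        if sample_ops > cutoff:                      # non-random selection
            picked_sample.append(sample_ops)
            picked_true.append(true_ops)
    n = len(picked_sample)
    return sum(picked_sample) / n, sum(picked_true) / n, n


sample_avg, true_avg, count = simulate()
print(f"selected {count} players")
print(f"average sample OPS: {sample_avg:.3f}")
print(f"average true OPS:   {true_avg:.3f}")
```

Changing `noise_sd` to mimic a week, a year, or four years of data shows how the gap between the two averages shrinks as the sample period grows.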


Banner Years

November 8, 2002 - MGL

Well, Frank, not only did you not answer my last few questions, but it is obvious to me that you know virtually nothing about basic statistical principles (at least not enough to have an intelligent discussion about this kind of analysis).

Your statement:

"Rey Ordonez has NO chance of showing up in Tango's sample, even if you simulated his performance over a thousand years."

is of course patently wrong. Rey Rey or even my 13 yo cousin does not have NO chance of showing up in Tango's sample. Everyone has SOME FINITE chance, no matter how infinitesimal. The tails of a bell curve reach out to an infinite distance.

That being said, the discussion (with you) is probably over. Nevertheless, my sample will contain slightly better than average players whereas Tango's sample will contain distinctly better than average players. We all know this of course.

Nevertheless, the answer to my questions will be the same whether we use my sample or Tango's sample. Both samples, of course, are non-random, which was the crux of your original criticism. Tango's is very non-random and mine is slightly non-random. At what point do you switch answers (that you do or don't use "regression to the mean" to estimate the sample's true lwts ratio or OPS and that "the mean" is not the mean of all players)? You don't like "one week" because it encompasses most players (again, it is still non-random as the worst players will tend to not have any or many one week OPS's above .980, whereas the great players will have many - in fact probably around half of all their weeks will be above .950)? What about 2 weeks? 6 months? One year? 2 years? 4 years like in Tango's study? At what point do we no longer "regress to the mean of the population" in order to estimate true OPS or lwts ratio? Your argument is silly!

The reason I used one week is to make it obvious to you that even if we take a non-random sample of players from the population of all players and our selection criteria is greater or less than a certain OPS or lwts ratio, that the estimate of true OPS or ratio in that non-random group must be "less extreme" than the sample result. You know that; everyone knows that. The one-week experiment makes it obvious. You are just digging in your heels at this point, for whatever reasons. Remember that what we use for value 2 (the so-called "mean") is simply the number that represents the limit of our regression.

Read this!!!!

Here is where you are getting screwed up, no offense intended. In the one-week experiment, you know intuitively that the true OPS for our sample players is actually somewhere near the mean of the population (a little higher), so you don't mind using the population mean as the number to regress towards (remember that number, value 2, just tells us DIRECTION and LIMIT). In Tango's experiment, you know intuitively that the sample group is mostly very good players with very high true ratios - which is true - so you don't like using the population mean as value 2. That's fine. The true OPS of Tango's sample players is nowhere near value 2 (the mean of the population); we still use it though to give us direction and limit - that's all value 2 is - remember? The concept of value 2 being the limit becomes silly when we choose mostly very good players of course, but we still use it to give us direction because it is the only KNOWN value that we have. In those extreme cases, we don't really NEED it, as we can just say that the true ratio of Tango's sample players is "something slightly less than 149", but we can also say that it is "149 regressed towards 105" (or whatever the mean ratio of all 5-year players is). That's all I have been trying to say.

Basically, Frank, once you establish value 2 as your direction and limit of your regression (and value 2 is always, by definition, the mean of the population), then you can decide how much to regress your sample mean (the 149 in Tango's case). Without doing a regression analysis or having other information about the distribution of true ratios in the population, you can only guess at how much to regress. In my case - the one-week guys - you regress a lot, which you know intuitively. Still doesn't change value 2, does it, even though you know intuitively that the final answer (the best estimate of the sample players' true OPS or ratio) is going to be close to value 2?

Let's take 6-month players. Intuitively, you know to regress less than you do for the one-week players, but still a lot. Again, we still use value 2, or the population mean, to tell us direction and limit. If you didn't use a precise value for value 2, you might make a mistake and regress too much! For example, if we did the same experiment and used one month samples above an OPS of .980, you would know intuitively that their true OPS's were a lot less than that, right? Let's try to guess .850. Is that too high or too low? Without knowing the population mean and using it as our value 2 (the limit of the regression), we can't tell! If I told you that the population mean is .750, then you would say "Phew, that's not too low!" In fact, you would probably now say that their true OPS was around .780 or .800, wouldn't you? IOW, you need to have that value 2 in order to have SOME IDEA as to how much to regress! And that value 2 is always the population mean, arbitrarily and by definition, simply because you know that the true OPS of your sample of players cannot be less than that (since you selected your players based on a criteria of some time period of performance GREATER than the average).

How about 1-year samples above .980? We know to regress even less, but we still need a value 2 to give us SOME IDEA how much to regress and to make sure that we don't regress too much! When we get to 4 year samples, as in Tango's study, we don't all of a sudden say "No more regression!" We still regress, only this time a very small amount! Do we need to know what direction? Well it's kind of obvious since we know from our selection criteria that our sample includes more players who got lucky than unlucky. So we know to regress downwards. Do we need to know exactly what the limit of regression is - i.e., do we need to know value 2? No, of course we don't, but it doesn't change the fact that, like the 1-week, 1-day, 1-year, or 3-year experiments, there still is a value 2, which is still the mean of the overall population, and that technically that number established the direction and limit of our regression. If in Tango's study you don't want to call it "regression to the mean", that's fine. Who cares? It is only semantics. If you want to call it "a little regression downwards" that's fine too. It doesn't change my discussion an iota!

I hate when someone takes an "expression" (out of context), criticizes it (with valid criticism), and then uses THAT criticism to invalidate a person's whole thesis. Why don't you just say "I agree with your whole thesis (in fact, as I said before, there is no agreeing or disagreeing - I'm just parroting proper statistical theory to you in my own words and relating it to these baseball experiments), but I don't like sentence # 42?" It's like if I tell you the entire and correct theory of the origin of the universe (I know that it is debatable), and I finish off by telling you that the universe is contracting rather than expanding and you tell me that I'm completely full of crap (political commentators do this all the time)!

Here's an example, BTW, of where knowing value 2, and making sure that it is the mean of the whole population, IS important, even when we take multi-year samples. Let's say we did the same experiment as Tango, but we select players who were only 5% above average for 2 straight years. Let's say that the average OPS for that sample was .788. Even though we are reasonably certain that our sample is made up primarily of above average players (no Rey Rey's in this sample), do we need to know value 2 in order to "hone in on" how much to lower the .788 to arrive at an estimate of the true OPS of our sample? Sure! Let's say I don't tell you what the mean OPS of the population of ALL 3-year players (players who have played for at least 3 or 4 years) is. You might choose .775 as the best estimate of our sample players' true OPS. Whoa! If I then told you that the mean OPS in the population (of all 3-year players) was .776, you would know that you went too far! If I told you that it was .750, then you would know that you were in the ballpark (no pun intended). So while sometimes knowing value 2 is important and sometimes it is not (it is obvious in which direction to regress and it is obvious about how far), it doesn't change the fact that we always HAVE a value 2, and that it is always the mean of the population!

Whew....


OPS: Begone!

May 20, 2003 - MGL

Good work Tango!

I think Darren might be right about Beane's "3 times the value" comment (see his post). I also think that while Beane and company are parsecs ahead of their competition, there is a large gap between his and DePodesta's knowledge of and efficient and correct use of sabermetric principles and that of many sabers on Primer, BP, and Fanhome (and wherever else they might lurk). IMO, the A's would be far better off hiring someone like James, Voros, Tango, etc., than trying to do sabermetric analysis themselves. It is kind of like when Brenly asked Matt Williams to sac bunt last night. I'm sure he is capable of doing so, but...

Also, I think that teams will quickly start catching up to the "player acquisition" principles being espoused and used by Beane and company, especially since they are now being publicized in a mainstream book (nice job Beane). I don't think it will be very long before it will be very difficult to pick up high production (high OPS or whatever) but not traditionally highly regarded players cheaply. The next frontier for picking up undervalued players will be and should be DEFENSE, and other Super-lwts components. It will be a long time before teams start using things like UZR to evaluate the overall impact a player will have on their team. Right now, one of the best ways to pick up valuable players cheaply and "sell" players expensively who are not all that valuable (buy low and sell high), is to look for large gaps between a player's traditional defensive rating (scouting, reputation, etc.) and their UZR (or other good defensive metric - are there any others?) rating. (I think the days are numbered as to being able to do that for offense.) This should provide for plenty of value for a while I think. In the book, Beane implies that they use some kind of defensive rating that sounds suspiciously like UZR, via some computer company or something. Anyone know more about that?


SABR 101 - Relative and Absolute Scales (June 6, 2003)

Discussion Thread

Posted 6:11 p.m., June 7, 2003 (#3) - MGL
  Good stuff Tango!

Patriot, would like to see you expand on that thought. I always thought that BJ was brilliant (and a very good writer), but like many brilliant people, he can be considerably one-track...


SABR 301 - PZR - Blueprint (June 17, 2003)

Discussion Thread

Posted 1:23 a.m., January 14, 2004 (#4) - MGL
  I read through some of the old stuff and I'm still lost. Here is one of your equations, Tango:

Anyway, DER = UZR+Park+PZR.

Maybe I do get it. Are you saying that UZR measures how much better or worse a fielder handles line drives to zone 7S or hard hit ground balls to zone 56M, etc., without regard to the distribution of those batted balls (e.g., we don't care if fielder A got 100 hard hit balls to zone 56M only and fielder B got 100 slow hit balls to zone 56M only - if they both fielded them at the league average, they would both have a UZR of zero), but PZR only considers the distribution of those different batted balls - i.e. PZR doesn't care which ones are actually turned into outs or not, only the league average out rate for each kind of ball in each zone (as well as the other parameters)? For example, if pitcher A had 100 hard hit balls in zone 56M only and pitcher B had 100 soft hit balls hit to zone 56M only, then pitcher B has a much better PZR? And we would calculate the exact PZR based on the league average out rates of hard hit balls and soft hit balls into zone 56M?

Aha, now I think I get it! I was thinking that PZR was like UZR in that it considered the actual out rate of each type of batted ball/zone/runners, outs, etc., and compared that to the league average rates, yielding the same result as a pitcher's collective fielders. Now I see what you are doing! Brilliant! Now I also see how park adjusted UZR + park adjusted PZR = DER.

Of course, once we figure PZR, we still want to know how much of PZR is luck and how much is skill. I have a feeling that you already calculated that ahead of actually doing the individual PZR's. That must be from the team PZR's that you estimated from the team DER's minus the team UZR's. Is that right? And you came up with the fact that pitchers and fielders have about the same amount of responsibility? Is that right? And how much of each one's value (UZR or PZR) is skill and how much luck? I guess what that question always means is that for an infinite sample size, what is the regression? I think that is what that question means.

Hmmm... PZR. Brilliant! I know Tango is now thinking, "What took that idiot (boor) MGL so long to figure this out?"

Let me know if I have this right now, and I'll do some of the preliminary work.

I assume that the only things you can hold a pitcher responsible for, and that you want to include in PZR, are where the balls are hit, what type and how hard. You can't hold him responsible for the other parameters, like baserunners (well, MAYBE baserunners), outs, and handedness of batters (other than how the pitcher's handedness affects the batters' handedness), so I assume that you would want to "include" some parameters in PZR, and adjust for, but not include, other parameters. In that way, it is a little different than doing the UZR calculations. Let me give an example of how I would calculate a PZR and how I would handle the parameters issue, which is different from how they are handled in UZR (for PZR some of the parameters establish the baseline, and some of them are used to "adjust" the baseline; for UZR all the parameters are used for one and not the other). Correct me if I'm wrong here...

pitcher A

100 hard hit balls to zone 56M all with one out and by RHB's in 50 innings. That is all of his batted balls.

League averages

All hard hit ground balls to zone 56M are caught 60% of time with 0 outs and RHB, 62% 0 outs and LHB, 64% 1 out and RHB, and 66% 1 out and LHB.

All soft hit ground balls to zone 56M are caught 70% of time with 0 outs and RHB, 72% 0 outs and LHB, 74% 1 out and RHB and 76% 1 out and LHB.

All ground balls are caught 70% of time with 0 outs and RHB, 65% 0 outs and LHB, 75% 1 out and RHB and 70% 1 out and LHB.

All GB's are caught 70% of the time regardless of outs or batter hand.

If we don't want to penalize (or reward - whatever the case may be) the pitchers for the outs and the batter handedness, then we calculate for pitcher A:

If a league average pitcher gives up 100 ground balls with 1 out and a RHB at the plate, 75% are caught (see above league averages). However, pitcher A's 100 ground balls were all hit hard, and were all hit to zone 56M (with 1 out and a RHB). A league average pitcher who did that would have only 64% of those kinds of GB's caught (also, see league averages above). So our pitcher A allowed 11 fewer balls to be caught (regardless of how many were actually caught - that gets into the UZR realm), for a PZR of 11*.8 or 8.8 runs per 100 BIP or 50 innings, whichever we used as our "rate."

If we want to penalize (or reward) pitcher A for the fact that all his hits were by RHB and were with 1 out, then we would have to start with:

The league average conversion rate for ALL GB's, regardless of outs and batter handedness is 70%. That is the only thing that changes in our calculations. Now we take the difference between 70% and 64% for a PZR of 6*.8 or 4.8 runs (per 100 BIP or 50 innings).
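The two versions of the calculation can be sketched in a few lines. The function name and the sign convention (runs cost relative to the baseline, shown as a positive number) are my own choices, and I am reading the .8 multiplier from the example as the run value of a ball that falls in instead of being caught:

```python
RUNS_PER_UNCAUGHT_BALL = 0.8  # the .8 multiplier used in the example above


def pzr_runs(baseline_out_rate, pitcher_mix_out_rate, balls_in_play=100):
    """Runs cost by a pitcher's batted-ball mix relative to a baseline.

    baseline_out_rate:    league out rate used as the starting point.
    pitcher_mix_out_rate: league out rate for the kind of balls this
                          pitcher actually allowed (speed, type, zone).
    """
    fewer_outs = (baseline_out_rate - pitcher_mix_out_rate) * balls_in_play
    return fewer_outs * RUNS_PER_UNCAUGHT_BALL


# Baseline adjusted for outs and handedness (1 out, RHB): 75% vs. 64%.
adjusted = pzr_runs(0.75, 0.64)
# Baseline for ALL ground balls regardless of outs/handedness: 70% vs. 64%.
unadjusted = pzr_runs(0.70, 0.64)

print(round(adjusted, 1), "runs per 100 BIP with the context-adjusted baseline")
print(round(unadjusted, 1), "runs per 100 BIP with the overall baseline")
```

The only thing that differs between the two calls is the baseline, which is exactly the choice being debated: whether outs and batter handedness belong in the baseline or in PZR itself.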

Is that right? Should the pitcher be "penalized/rewarded" for any parameter other than speed, type, and location of batted ball? I don't think so. After all, we wouldn't think of doing that for park effects. We "adjust" for park effects. Why not "adjust" for outs/baserunners/handedness, and certainly batter G/F ratio (hmm.. do I adjust for batter G/F ratio in UZR? Probably not because there are so many batters it is not worth it - they probably average to near league average), as we should for park effects?


SABR 301 - PZR - Blueprint (June 17, 2003)

Discussion Thread

Posted 1:24 p.m., January 14, 2004 (#9) - MGL
  I probably agree with J. Cross on the handedness issue for most pitchers. Kind of like my explanation on the QOC article for not adjusting a pitcher's stats for opponent handedness in the QOC adjustments (but should do it for batters). There are exceptions though, like for LOOGY's, as I explained in the article. And of course, it would be a little unfair (one way or the other - good or bad), if a pitcher with not that many PA's happened to have faced more than his share of RHB's or LHB's, for no particular reason. Our concern might be somewhat baseless, as I'm sure the "adjustments" one way or another don't amount to much.

What about baserunners/outs? Should pitchers be responsible for any weird runners/outs profiles they have that significantly affect their PZR?

I'm not even sure that we are going to gain anything with PZR's. I don't think it will help in pitcher evaluation or projection. After all, the regressions sort of take into consideration the inherent PZR's. Plus we can infer them quite accurately, by just "subtracting" their fielders' UZR from their stats. In fact, when I do my pitcher evaluations and projections, I do a QOF adjustment which uses team regressed UZR (each fielder's multi-year regressed UZR added together, pro-rated by the distribution of that pitcher's BIP's).

Even when we get PZR, they still have to be regressed to "remove" the luck element. It might be nice to quantify what DIPS tries to ignore, but what is the purpose? Tango originally said something about using PZR to validate DIPS. I'm still not sure what that means. Let's say that after you get the PZR it turns out that it is exactly on the scale of UZR and that the regressions are about the same (as we think is true). What does that say about DIPS? Only that a pitcher's BABIP is x part defense, x part pitcher, and y part luck. I think it has already been proven that: one, the pitcher does have some pretty decent control over BIP, and that two, the luck element is at least as strong as the defense element, probably much stronger...


UZR, multiple positions (July 7, 2003)

Discussion Thread

Posted 10:10 p.m., December 18, 2003 (#20) - MGL
  This is good stuff. I will have to re-read before I put out my Super-lwts again...


SABR 301 - Regression towards the mean (July 22, 2003)

Discussion Thread

Posted 12:39 p.m., January 14, 2004 (#3) - MGL
  I just realized why that website is so great! It's from my alma mater - Cornell U.!


Solving DIPS (August 20, 2003)

Discussion Thread

Posted 12:51 a.m., December 27, 2003 (#19) - MGL
  As I said on Fanhome, that is a phenomenal article. It should make your head spin!

Anyway, looks like the only way for me to stop procrastinating is to go cold turkey. So, after this weekend, I won't be stopping by for a while, or reading anything else online. If someone wants me to post some links in Primate Studies, I'll be glad to do so, but I won't offer any of my thoughts on the matter. I'll be back in time for the World Series in a limited capacity.

I feel for you as much as anyone of course, as I periodically get addicted to Primer and Fanhome. However, how many times have you threatened to leave for a while and then come crawling back? ;)

Actually I need to do the same thing and concentrate on my real work and the book...


Results of the Forecast Experiment, Part 2 (October 27, 2003)

Discussion Thread

Posted 11:13 a.m., October 28, 2003 (#50) - MGL
  Tango,

What are the numbers for a "monkey" if the monkey uses a 3/4/5 weighting for the 3 years? How about a weighting plus a basic park adjustment (using a 3-year, or similar, "OPS park factor") for those players who have not played on the same team for the 4 years in question, or if you don't want to do that much work, a park adjustment for only those players who switched teams from 2002 to 2003?


Results of the Forecast Experiment, Part 2 (October 27, 2003)

Discussion Thread

Posted 7:00 p.m., October 28, 2003 (#59) - MGL
  I am not surprised at Marcel's success. If you factor in the park changes, Marcel probably (and should) blows everyone away! I've been saying (screaming) for years that projecting player performance is NOT rocket science nor does it take any special scouting, observational, intuitive, or even mathematical skills. It is simply "Monkey See Monkey Do," as Tango's experiment illustrates. I cannot say this enough. Barring some injury or other extraordinary factor, the best estimate of a player's performance is his last 3 or 4 years' performance, weighted and adjusted for age and context (park, opponent, etc.)! This is so important it bears repeating a hundred times or so (but I won't)! In fact, if you do just about anything else, you are probably going to do a lot worse than the sophisticated Monkey (Marcel plus context adjustments).

Although I was not able to participate in Tango's experiment this year (hopefully I'll have the time next year), my forecasting algorithm is available for all the world to see, and I'm sure my results would be somewhere near the top. I simply take each player's stats from the last 3 years, adjust them component by component for the strength of all of his opponents in those 3 years, adjust them for each park that a player plays in over those 3 years, and adjust each component for age (remember that aging curves look very different for each component). Then I adjust (to a healthy baseline) an entire year if that player was slightly injured, moderately injured, or severely injured in that year. Then I combine the 3 years using a 5/4/3 weighting system and regress each component towards the mean of an average player of similar height and weight. The fewer PA's a player has in those 3 years, the more each component gets regressed. Finally, if there is a continuing or new injury, I adjust the final stats to account for that injury.
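The weighting and regression steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual algorithm: the `Season` class, the `project_rate` function, and the `regress_pa` constant of 3600 weighted PA's are all hypothetical, the input rates are assumed to be already adjusted for opponents, park, age, and injury, and the mean for players of similar height and weight is simply passed in.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Season:
    pa: int        # plate appearances in that season
    rate: float    # per-PA rate for one component, assumed already context-adjusted


def project_rate(seasons: Sequence[Season], mean_rate: float,
                 weights: Sequence[float] = (5, 4, 3),
                 regress_pa: float = 3600.0) -> float:
    """Combine up to 3 seasons (most recent first) and regress toward mean_rate.

    regress_pa is a hypothetical constant, in weighted-PA units, that controls
    how hard the observed rate is pulled toward the mean.
    """
    weighted_pa = sum(w * s.pa for w, s in zip(weights, seasons))
    if weighted_pa == 0:
        # No playing time at all: the best guess is the population mean.
        return mean_rate
    observed = sum(w * s.pa * s.rate for w, s in zip(weights, seasons)) / weighted_pa
    # Fewer PA's -> lower reliability -> more regression toward the mean.
    reliability = weighted_pa / (weighted_pa + regress_pa)
    return mean_rate + reliability * (observed - mean_rate)


full_time = project_rate([Season(600, 0.10)] * 3, mean_rate=0.08)
part_time = project_rate([Season(100, 0.10)] * 3, mean_rate=0.08)
print(round(full_time, 4), round(part_time, 4))
```

With these made-up inputs, the full-time player is projected at about .093 and the part-time player at .085, which is the "fewer PA's, more regression" behavior described in the post. Each component (walks, home runs, and so on) would be run through the same function separately.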

I don't like to say this too much, because you get tons of flak from almost everyone other than hard-core sabermetricians, but, at least as far as player evaluation goes, for the purposes of projecting player performance, setting salaries, and putting together a successful team, you don't need to watch players and you don't need scouts (except perhaps for minor leaguers - even then, can you say MLE?). All you need are a player's stats! I live by this credo and I'll die by it! And I think that this whole experiment and discussion suggests that it is true at least to a large extent!

Seriously, how do you think most managers and GM's would do if they participated in this forecasting experiment? It would be so embarrassing it would be scandalous! Enrique Wilson (the best utility player in baseball according to Tim McCarver), Tony Womack, Neifi Perez, and Luis Sojo might be in someone's top 10!

This crap by some saber-types conceding that you have to combine sabermetrics with a "feel for the players," scouting, and other traditional evaluation techniques, in order to evaluate players and put together successful teams, is just that - a bunch of pandering, lip service crap - and I'm not afraid to say so!


Results of the Forecast Experiment, Part 2 (October 27, 2003)

Discussion Thread

Posted 8:31 p.m., October 28, 2003 (#62) - MGL
  I'll try and do my "after the fact" projections and see where they stand. Of course, no one will know for sure whether I cheated or not (I won't). My algorithm is basically what I described. The only thing I didn't specifically give (I'd be happy to) is what numbers I use to adjust for injury years, what numbers I use for park factors and age factors, and what my regression formula is.

As far as the 5%, 25%, etc. levels, such as Pecota does, personally, I don't think anything other than using regular old z scores is appropriate (IOW, if you have a .700 OPS projection, then there is a 5% chance that that player would have an OPS more than 2 SD above or below .700, where one SD is based on one year's worth of projected PA's). Anything other than that (such as what Pecota tries to do) is BS I think (I am not sure)...


Results of the Forecast Experiment, Part 2 (October 27, 2003)

Discussion Thread

Posted 9:04 p.m., October 28, 2003 (#64) - MGL
  I consider Voros the projection God! My algorithm is better now anyway!

Seriously, I think that any variation on Marcel, with park adjustments, is as good as any other and about as good as you can get. Plus there is obviously a fairly large sample error (margin of error) factor in the results...


Results of the Forecast Experiment, Part 2 (October 27, 2003)

Discussion Thread

Posted 12:58 p.m., November 1, 2003 (#75) - MGL
  Walt,

One of the problems is that there are two factors which determine what the "curve" will look like - one, the distribution of a binomial (will the player get a hit, a walk, a home run, etc., in each PA or won't he?), and two, the distribution of possible changes in true talent level, which is presumably based on things like changes in age, physical condition, injury, "learning," and mental and psychological factors. The former should produce a normal curve, by definition - the latter, who knows? Pecota seems to focus on the latter and completely ignore the former. The former cannot be ignored. It will always exist and there is nothing that anyone can do about it. As I like to say, it is possible that certain players have a consistent talent level from day to day while others do not (for whatever reasons), but a player has no control over the random (actually semi-random, but then again, the throw of a die is semi-random as well) nature of the outcome of each PA. You (Walt) are talking only about the latter (changes in talent level from year to year) as well, whereas I am pretty much talking only about the former. But my contention is that the latter is either insignificant as compared to the former OR that it mimics the distribution of the former (it is bell shaped with a similar SD), so that the net result is a performance distribution which is approximately normal with an SD defined by the binomial distribution of OPS, BA, or whatever metric we are talking about...
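As a rough illustration of the binomial part of that argument, here is a minimal sketch for a simple per-PA rate such as OBP. OPS itself is not a simple binomial, so this should be read only as an example of the idea; the function names, the .340 rate, and the 600 PA's are assumptions made up for the example.

```python
import math


def binomial_sd(p: float, n: int) -> float:
    """SD of an observed rate over n trials when the true rate is p."""
    return math.sqrt(p * (1.0 - p) / n)


def two_sd_band(p: float, n: int) -> tuple[float, float]:
    """Range covering roughly 95% of outcomes under the normal approximation."""
    sd = binomial_sd(p, n)
    return p - 2.0 * sd, p + 2.0 * sd


sd = binomial_sd(0.340, 600)
low, high = two_sd_band(0.340, 600)
print(f"SD = {sd:.4f}, band = {low:.3f} to {high:.3f}")
```

For a true .340 rate over 600 PA's the SD is roughly .019, so chance alone puts the band at about .301 to .379 before any change in true talent is considered.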


Diamond Mind Baseball - Sending the runner on a 3-2 count (October 28, 2003)

Discussion Thread

Posted 1:23 p.m., October 29, 2003 (#4) - MGL
  1) I think it needs a lot more study to determine the "break even points."

2) I think assuming that the runner only gets thrown out 1/3 of the time is quite wrong. First, lots of non-base stealers (those who would get thrown out 2/3 of the time if they tried to steal a base) are sent on 3-2 counts.

3) I agree that batters are forced to swing more on marginal pitches with the runner going. Batters already swing too often on 3-2 counts (because they are afraid to take a called third strike). It is probably not a good idea to "force" them to swing even more often.

I think a good rule of thumb is to send above average base stealers only (then again, if they are above average, why were they not stealing before the 3-2 count?)

An interesting question is what about with runners on 1st and 2nd? Is it EVER correct to send the runners? Managers do this all the time.


Diamond Mind Baseball - Sending the runner on a 3-2 count (October 28, 2003)

Discussion Thread

Posted 6:07 p.m., October 29, 2003 (#5) - MGL
  In #2 above, I meant to add..

Second, for some strange reason, many otherwise fast runners and/or good basestealers seem to think that it is OK to get bad jumps and jog down to second on a 3-2 count, rather than the correct approach which is to treat it as a straight steal...


ALCS Game 7 - MGL on Pedro and Little (November 5, 2003)

Discussion Thread

Posted 8:25 p.m., November 11, 2003 (#52) - MGL
  As I have already said, it is extremely unlikely that a pitcher who has the K, BB, and HR rate that Pedro has had after 105 pitches (his historical stats) AND who is still throwing in the 90's with movement, with no great difference in control or command (observation), has a true $H of anywhere near .400. It's just not possible. In order to have a true $H of .400 or so, a pitcher would HAVE TO throw a straight fastball in the 80's (or less) with little else in terms of command or offspeed pitches. That's one of the points that Tango makes and he is correct. Yes, of course it is possible for a good major league pitcher to get so tired that he has the talent of an A-ball pitcher. It is likely, however, that no manager, in any situation, even Grady Little, is going to leave a pitcher in that long. It is also likely that that kind of drop-off would occur at a VERY high pitch count (I don't think that pitchers "drop off the table" - I think that it is a gradual, though not necessarily linear, decline - but I'm not sure and it doesn't really matter to this discussion) AND if Pedro were pitching with a talent anywhere near that of an A-ball pitcher, you would probably notice something severely amiss other than not throwing good 2-strike pitches.

Mike, I'm not sure at all what you were trying to do with your quick and dirty study. Even if a pitcher never had days where his talent level differed from any other day, he would have stretches where any combination of hard/soft/line drive/fly/ground balls were hit due to chance alone. And of course, all pitchers who were taken out after less than 4 innings by definition would have had an unusually high number of hard hit balls AND line drives. With all due respect, I really don't see what the point of presenting the data was.

What you want to do is to look at all pitching starts in which a pitcher gets hit hard in the first 3 innings (high percentage of line drives and high percentage of hits per GB and FB) and then look at their $H (or whatever stat you want to use to represent how "hard" they are being hit) in the next inning. That will tell you something about whether enough pitchers fluctuate significantly enough in their "stuff" (ability as opposed to performance) that it is "correct" to take out a good pitcher after he gets shelled for several innings even though you might be making a mistake (it might just have been bad luck - BTW Ross, there are 2 kinds of bad luck in this regard; one is when bloop hits fall at the right time and a bunch of runs score accordingly; the other is when a pitcher gets hit hard even though his true "talent" has not changed). My guess is that in the next inning all of these pitchers as a group will revert back to very near their normal stats (in fact, I'd bet a lot at even money that this is true). Like clutch hitting, such a test would not prove or disprove that pitchers' talent fluctuates from day to day or even from inning (or batter) to inning; it merely would evidence, one way or another, whether one could use getting "hit hard" in any given number of innings as a proxy for this talent fluctuation or whether there is not enough prevalence in this regard (either not enough pitchers who do fluctuate in talent and/or the fluctuations in talent are not that great) to be able to distinguish it from the inevitable fluctuation in performance due to the random (binomial) nature of the events. Again, my guess is that the idea that getting "hit hard" (or not) is significantly indicative of a pitcher's true talent at the time (IOW, predictive of the future) is another of those truisms that turn out to be clearly not true. Tango and I have been debunking many such myths lately and will present some of them in due time...


ALCS Game 7 - MGL on Pedro and Little (November 5, 2003)

Discussion Thread

Posted 5:53 p.m., November 13, 2003 (#58) - MGL
  My apologies for putting words into your mouth Tango. I also think you are being a little too politically correct. :) What happened to the Cartman that I knew and loved?

Oh, and why do you bother?


David Pinto and fielding (November 10, 2003)

Discussion Thread

Posted 7:49 p.m., November 10, 2003 (#5) - MGL
  I haven't been following this thread too much, but yes, I regress most of the adjustments (I can't say ALL of them - I'd have to check) before I apply them - definitely the park adjustments. I use multi-year data for the adjustments and apply a regressed version to data in individual years (within that multi-year period).

For example, if the "ground balls through the infield" sample park factor at Dodger Stadium is 1.06 using data from 93-02 (hopefully the infield has not changed much over the years - of course if an infield has changed, like in Philly and Tampa, I use different years and different PF's), I would regress that 1.06 to maybe 1.03 (grass infields get regressed towards .99 and turf infields towards 1.02, I think). That is the "ground ball" park factor (the 1.03) that I would use to adjust all ground balls at Dodger Stadium in any year (between 93 and 02). Actually, I don't think I regress any of the other adjustment factors (GB/FB pitchers, handedness, speed of batted ball). I think I use a 4-year sample adjustment factor, with no regression, but I'm not sure.
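The Dodger Stadium example can be written as a one-line shrinkage toward the surface-type mean. This is a sketch only: the function name is made up, and the weight of 4/7 is simply back-solved from the numbers in the example (1.06 regressed towards .99 and landing on 1.03); the actual regression amount is not given in the post.

```python
def regress_park_factor(sample_pf: float, prior_pf: float, weight: float) -> float:
    """Pull a sample park factor toward a prior (e.g. the grass or turf mean).

    weight is the share of the sample deviation that is kept: 1.0 means no
    regression, 0.0 means the sample is ignored entirely.
    """
    return prior_pf + weight * (sample_pf - prior_pf)


# Grass infield, multi-year sample PF of 1.06, regressed towards .99.
# The 4/7 weight is an assumption chosen to reproduce the 1.03 in the example.
gb_pf = regress_park_factor(sample_pf=1.06, prior_pf=0.99, weight=4 / 7)
print(round(gb_pf, 2))
```

The same regressed factor would then be applied to every ground ball at that park in any single year inside the multi-year window.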

Interesting (and very good!) work by David. When I get some time, I'll check it out in more detail...


Pitcher's Hitting Performance When NOT Bunting (November 18, 2003)

Discussion Thread

Posted 1:41 p.m., November 19, 2003 (#3) - MGL
  To really assess how good a hitter the pitcher is, you need to remove all AB's where he unsuccessfully attempted to bunt at all, even if that wasn't the end of the at bat.

Actually those are grouped as bunt attempts. Since I have pitch by pitch data, most of the time I can tell when a pitcher has attempted a bunt and then switched to swinging away with 2 strikes.

In any case, Tango is right in that the above data is only when there is not a bunt situation, as I did not want to include those times when the infield was playing for a bunt and the pitcher swung away.

I was also able to calculate (not exactly, but pretty close) whether it is correct for a pitcher (or position player) to continue bunting or to swing away with two strikes or not (and what to do at the various counts, in terms of switching from bunting to swinging away or vice versa). Of course, that depends upon how good a hitter the batter is with 2 strikes versus how good a bunter they are. I can give the break even points OR tell you what an average pitcher (and position player) should do, however.

You'll have to wait for the book on that one though! ;) Good critical thinking on your part (questioning whether it is correct for a pitcher to still bunt or not with 2 strikes)! That is one of those (many) things that a manager would rightfully have NO IDEA what is correct or not, and to think that he does is both arrogant and stupid, since somewhere on this earth lies a person or two who could figure it out if they (managers) would only bother to ask!


Pitcher's Hitting Performance When NOT Bunting (November 18, 2003)

Discussion Thread

Posted 6:09 p.m., November 19, 2003 (#5) - MGL
  That's a good one! That could be taken either way though, and I'm not sure which way you mean it. Either way, you're (I'm) going to piss people off. I assume you mean that you are going to do most of the writing (which is good, since you write better than I do, except when you get cryptic), since we want people to actually buy the book, and controversy on the radio is often a good thing...


Pitcher's Hitting Performance When NOT Bunting (November 18, 2003)

Discussion Thread

Posted 6:14 p.m., November 19, 2003 (#6) - MGL
  One more thing:

and to think that he does is both arrogant and stupid, since somewhere on this earth lies a person or two who could figure it out if they (managers) would only bother to ask!

I thought that was one of my better statements! Actually it is a critical point that needs to be made public (the folly of people in all walks of life making critical decisions or offering opinions when they have no idea how to evaluate the merits of the various alternatives vis-a-vis their decision or opinion AND when those merits can and should be procured from someone else). The point can probably be made in not so harsh a fashion I suppose...


Persistency of reverse Park splits (November 20, 2003)

Discussion Thread

Posted 2:01 p.m., November 20, 2003 (#4) - MGL
  By the way, check out Sid Fernandez's home/road splits from his Met days. I bet those pass the significance test.

Did I just undermine my entire point?

To some extent you did. Looking at an extreme split (as compared to the average home park factor - i.e., a player who has a 1.20 home/road OPS ratio while playing for the Rox does not have an "extreme" split) for a player and doing a significance test on that is NOT the proper way to decide whether that sample split is a fluke or is "real" (or a combination). As is often the case with these types of questions, this is a Bayesian probability problem. First, you have to answer the question, "What is the distribution and magnitude of players in the population (of ML baseball players) who have unique true home/road splits that are different from the true splits of an average player for that home park?" Then, and only then, can you start doing "significance" tests on a particular sample split and some ensuing calculations. For example, if you answer the first question (the first part of the Bayesian calculation) with "There are no players with unique splits (such as if we were trying to find an association with a player's splits and the month of his birth)," then any weird sample split (even 4 standard deviations from the mean) is not going to suggest that the extreme sample split was anything but a fluke. That's how the analysis must be done.

People need to add one more important word to their baseball analysis vocabulary when it comes to these types of problems - "Bayes!"
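One simple way to see the Bayesian point is a normal-normal shrinkage sketch, which is a stand-in technique and not the calculation used in the study. The function name and all the numbers below are hypothetical; `true_sd` stands for the spread of true unique splits in the population and `noise_sd` for the sampling error of one player's observed split.

```python
def estimated_true_split(sample_split: float, expected_split: float,
                         true_sd: float, noise_sd: float) -> float:
    """Posterior mean of a player's true split under a normal prior.

    expected_split is the split an average player would have in that park.
    """
    true_var = true_sd ** 2
    noise_var = noise_sd ** 2
    if true_var + noise_var == 0:
        return expected_split
    # Share of the observed deviation that is believed to be real.
    shrink = true_var / (true_var + noise_var)
    return expected_split + shrink * (sample_split - expected_split)


# "Month of birth" case: no one has a unique true split, so even a sample
# split 4 SD from the mean (1.20 with a noise SD of .05) tells us nothing.
print(estimated_true_split(1.20, 1.00, true_sd=0.0, noise_sd=0.05))

# If true splits varied as much as the noise, half the deviation would be kept.
print(estimated_true_split(1.20, 1.00, true_sd=0.05, noise_sd=0.05))
```

The first question in the post (how many players have unique true splits, and how large they are) is what sets `true_sd`; the significance test on a single sample only speaks to `noise_sd`.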

And yes, not only did James find that there was virtually no such thing as a unique true platoon ratio in major league baseball (see my post in the Clutch thread about the T. Long trade a few days ago), at least for RHB's (and to a lesser extent for LHB's), so did the authors of the book "Curve Ball" and so did I.

Getting back to the Fernandez example and to home/road splits in general, if in fact we find that the regression for players' sample home/road splits is large, which my study suggests that it is, AND we know intuitively that some of a park's unique characteristics that go into its average park factor affect players differently (so it is unlike the "month of birth" example), what must be happening is:

1) The signal to noise ratio is low, probably due in part to the fact that we tend to forget or ignore that the sample size in splits is almost half that of a metric like OPS or BA;

2) there are probably only relatively few players who are significantly and uniquely (different from the average park factor) affected by a particular park; and

3) the effect of these unique influences is probably not that large.

This all leads to the conclusion that using average park factors IS appropriate for adjusting players' home stats and that it does NOT do more harm than good, that using a player's overall stats and their home park average park factor is a VERY good way to predict their future splits, regardless of their sample historical splits, and that in evaluating trades, for example, we should not worry so much about players who have shown extreme and anomalous splits (like Nomar), as those extreme splits are most likely a fluke, absent compelling evidence to the contrary, and even then, we need to be very, very careful (as always) that we don't invent, exaggerate, or embellish "compelling evidence" to accommodate our beliefs.

Now really getting back to El Sid, he may be one of those players with whom you do have SOME compelling evidence at least that his extreme splits may have some merit in terms of accurately representing (with SOME regression) his true splits, given that he was, as Tango said, one of the most extreme fly ball pitchers in baseball history (he once pitched a complete game in which the infield had zero assists), and that he had relatively few balls put in play against him.

Because of the relatively small sample size of the original study, I did the exact same thing for 1999 and 2000. Here are the abbreviated results using the same parks:

There were 27 players (4499 PA's) in the 1999 sample with "reverse" splits. The average of the players' sample splits in the pitcher's parks was 1.13 (remember it "should" be .96). In the hitter's parks, whereas the splits of all players "should" be 1.04, the players with "reverse" splits had a composite split ratio of .89.

In 2000, 19 of the 27 players "survived." The players in the hitter's parks who had a "reverse" composite split of .89 regressed to a composite split of 1.11 and the players in the pitcher's parks regressed from a "reverse" split of 1.13 to a split of .96.

The conclusion is now stronger that, without knowing anything else about a player other than his sample one-year home/road splits, in order to estimate his "true" splits or predict his future splits (again, they are basically one and the same), one should ignore those sample splits and simply assume that his future or true split ratio will be approximately the same as the average player in the league.

BTW, in Tango's lead-in to this study, he meant (or at least, he should have meant) "...in reverse of the 'park factor' of his home park" and not "the 'HFA' (home field advantage)..."


Persistency of reverse Park splits (November 20, 2003)

Discussion Thread

Posted 1:43 a.m., November 21, 2003 (#6) -