tangotiger posted October 7th, 2000 12:18 AM
Senior Member
Member Since: May 2000

I applied the adjusted LW by batting order values to every hitter since 1975 to see what kind of effect the player has on the batting order.

First off, the average batter's best hitting spot versus worst hitting spot is a range of about 3 runs / 600 PA. That works out to about 4% between best and worst case, which pretty much matches other studies on this subject.

But of course, some players have a much wider range. The player whose spot in the batting order is most important is none other than.... Rickey Henderson, 1982, then 1983, then 1980, and then Tim Raines, 1981. No big surprises.

About 10% of hitters are batting-order neutral, meaning that they would perform as well in any hitting spot.

The 2 spots with by far the greatest range of profiles are the #1 and #2 spots. Those 2 spots have the most specific requirements for a profile.

Anyway, here are the players most suited for the 9 hitting spots (they were normalized so that the sum total for each player is zero runs, otherwise Mark McGwire might turn out to be a great hitter in all 9 spots).

leadoff: Rickey Henderson, Tim Raines (worst suited: Juan, Matt Williams, Andre Dawson)

2nd: Joe Morgan, Willie Randolph (though Henderson and Raines appear all over here as well). worst suited: Dave Kingman

3rd: the range is very small here, and no one stood out in either side

cleanup: Mike Piazza, and incredibly Tony Gwynn and Wade Boggs.... as it turns out, the 2 most important qualities for a hitter here are hitting homers AND hitting singles... basically, if you already have a guy in scoring position, a single will score the runner just as well as a homer would
worst suited: Vince Coleman

5th: Juan Gonzalez. worst suited: Rickey Henderson

6th: no one player stood out, but Dave Kingman appeared a few times. worst suited: Rickey

7th: a whole mish-mash of players, a lot of them bad ones. worst suited: Rickey, Joe Morgan

8th: Matt Williams, Dave Kingman. worst suited? Rickey

9th: another mish-mash of players... worst suited: Doug Flynn

Finally, just picking out a few players, this is their best hitting spot:
Dwight Evans - 2nd, maybe 1st

Cal Ripken - 3rd

Marquis Grissom - MGL, you won't believe this, but it's leadoff! the reason is that he is totally NOT suited for 3rd, 4th, and 5th, and he's too good a player to bat 6th, 7th, and 8th.... his 2nd best spot is 9th

Wade Boggs - 2nd, by far, with a decent leadoff or cleanup

Derek Jeter - 2nd

Tim Raines - leadoff in the 80s, leadoff or 2nd in the 90s

Frank Thomas - 2nd

Mark McGwire - 9th!! as it turns out, his profile is best suited in the 9th spot.... HOWEVER, since the 9th hitter gets far less hitting opps than the top of the order, that is a ridiculous thing to do... his 2 other best spots are 2nd and cleanup

Sammy Sosa - 6th

Barry Bonds - again, 2nd or 9th.... the 9th is ridiculous for the same reason as Big Mac

Butch Hobson - 6th

Rey Ordonez - 7th

Anyway, that's it for me....





Warren posted October 7th, 2000 12:38 AM
Senior Member
Member Since: Dec 1999

Not to belittle the good work being done in this thread, but I imagine the difference in plate appearances by batting order position is too large to ignore, not just for players whose "best position" is ninth. When you're talking about a 3 run shift for an average batter, those extra plate appearances are crucial. Would you really want Matt Williams or Dave Kingman batting that low in the order?

I don't imagine adding a plate appearance adjustment to this analysis would be too much work. Of course, that's easy for me to say, since I'm not doing any of the work

David Smyth posted October 7th, 2000 08:49 AM
Sports Guru
Member Since: Dec 1999
Location: Lake Vostok

Nice work.

I agree with Warren about a PA adjustment or something like that.

What is the typical profile for the "batting-order neutral" players? Is it just a neutral profile?

If you had Rickey Henderson, circa 1980, on your team, you would have to consider which slot suits HIM best, but also which slot suits the TEAM best--including how good Hend. is overall (OTS), and who and how good are the other batters. So if Hend., 1980, was clearly the best hitter on the team, where should he bat? Probably not leadoff, but I'm not sure.

It would seem like, everything else being equal, a BO neutral profile should be preferred for its versatility.

tangotiger posted October 9th, 2000 01:57 AM
Senior Member
Member Since: May 2000

You both make excellent comments, and I agree with them. I was thinking about both topics as well in the past.

The PA adjustment absolutely becomes critical when batting order positioning is worth so few runs. I'll see if I can work that out....

As for the A's 1980, etc, I was going to use my favorite teams (Expos and Red Sox), and see how they should have batted their players in 1994 (Grissom, Walker, Alou, etc) and 1986 (Boggs, Evans, Rice, etc). These 2 teams have interesting players with interesting profiles, and I wanted to see how the team most benefited from the batting order. I'll see if I can get to it this week.

As for the batting-order neutral players, that is also another study. The thing is to run a sim with 9 equal players, and see where the variance comes from. Is it mostly from the 9 particular spots? And if we introduce real typical players in each of the 9 spots, how do the LW values change for each of the 9 spots? Lots of questions to answer.

I think, for now, the most important thing that we may have discovered is the sheer value of the #2 hitter, and how he shouldn't be just some slap no-strikeout hitter. He might turn out to be your best hitter.

tangotiger posted October 9th, 2000 06:59 PM
Senior Member
Member Since: May 2000

The difference in the number of PA between adjacent spots in the batting order is 15 PA. If we look at the typical superstar, and his +60 runs / 600 PA, we see that dropping him down a spot in the order will cost the team 1.5 runs. Batting McGwire ninth, where his profile is optimal, will cost about 60 PA or so, or about 6 runs. Since his advantage in the #9 slot is just a couple of runs, it makes no sense to bat him that low.
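
For anyone who wants to play with that arithmetic, here is a minimal Python sketch. The function name and arguments are made up for illustration; the 15 PA gap per slot and the +60 runs / 600 PA figure are the ones quoted in the post, and the sketch only counts the value of the lost plate appearances.

```python
PA_GAP_PER_SLOT = 15  # PA difference between adjacent lineup spots (from the post)

def runs_lost_by_dropping(runs_above_avg_per_600_pa, slots_dropped,
                          pa_gap=PA_GAP_PER_SLOT):
    """Runs a team gives up by moving a hitter down the order,
    counting only the plate appearances he loses."""
    runs_per_pa = runs_above_avg_per_600_pa / 600
    return runs_per_pa * pa_gap * slots_dropped

# A +60 runs / 600 PA superstar dropped one spot, then four spots (about 60 PA)
print(runs_lost_by_dropping(60, 1))
print(runs_lost_by_dropping(60, 4))
```

The second call is the "about 60 PA" case: four spots at 15 PA each.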

Hopefully, I can run through a couple of teams with this modified methodology to come up with the optimal order. Unless someone has a special request for a team (1999 or earlier) with a really weird makeup, I'll stick with the 2 teams I mentioned earlier, and run something with them.

tangotiger posted October 11th, 2000 04:04 PM
Senior Member
Member Since: May 2000

The 1994 Expos:
C - Fletcher
1B - Floyd
2B - Lansing
SS - Cordero
3B - Berry
LF - Alou (great year)
CF - Grissom
RF - Walker (great year)

The optimal batting order, assuming my adjustments to MGL's weights, AND adjusting for PA as described in my previous post, is:
1 - Grissom (sorry MGL)
2 - Berry
3 - Walker
4 - Alou
5 - Cordero
6 - Fletcher (or 7 or 8)
7 - Lansing (or 8)
8 - Floyd (or 6 or 7)

This lineup would produce 11 runs above a random batting order in a season (about 2% which is roughly what all the studies have shown).

Moving Grissom down to #2, Walker to leadoff, Alou to #3, Cordero to #4 and Berry #5 (hardly a good looking batting order) produces 0.3 runs less in a season! zero point 3.

Almost all the top players' optimal batting spot was #2. However, Berry loses a lot if you put him at #1, 3, 4, or 5, and therefore it was best for the team for him to bat #2. I can run through different scenarios of juggling the players in different spots but the conclusion is quite clear (for the 94 Expos): any half-reasonable lineup would have worked.

And with the PA adjustment, Rickey Henderson turns out to be even better. While Boggs and Gwynn lose their luster in the cleanup spot, the ideal cleanup hitter remains Mike Piazza. McGwire's ideal becomes #2 (and #9 falls completely off), simply because his walks lose too much value in the cleanup spot. Again, though, since most people's best spots are #2 (like Walker and Alou), it may turn out that McGwire's ideal is #4, given the other players available.





David Smyth posted October 11th, 2000 10:47 PM
Sports Guru
Member Since: Dec 1999
Location: Lake Vostok


I'd like to see a clear comparison of lineup slot values.

The way it could be done is first to determine the total number of PApp for every slot per 162 team games.

Then, using the PBP database, the average run expectancy per PApp could be calculated for each slot.

Then, multiply the two together and you have a total value for each slot. As a guess, I'd expect the leadoff man to be from average to 10% above average, the 3rd batter to be about 30% above average, and the 9th batter to be about 40% below average.

I think a batter's overall ability, relative to that of the other hitters in the lineup, is considerably more important a consideration as to his best lineup slot than his 'style' or 'profile' as a batter.

tangotiger posted October 12th, 2000 01:08 AM
Senior Member
Member Since: May 2000

David, I posted the LW values I am using in a post a while back (MGL's weights, but adjusted). All I'm doing is applying these LW for each player for every batting spot in every year, and using the PA for the particular batting spot as I also mentioned.

Then I simply try a whole bunch of combinations for a given team, until I get the order that produces the most runs.
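
A rough sketch of that "try a whole bunch of combinations" search, in Python. This is not the actual spreadsheet or program used in the thread. It assumes each player's value in each slot is already known (with the PA adjustment folded in) and that those values simply add up across the lineup. The player labels and run values are invented.

```python
from itertools import permutations

def best_order(run_values):
    """run_values[player][slot] = runs that player is worth in that slot.
    Tries every ordering and returns (total_runs, lineup)."""
    players = list(run_values)
    best_total, best_lineup = float("-inf"), None
    for lineup in permutations(players):
        total = sum(run_values[p][slot] for slot, p in enumerate(lineup))
        if total > best_total:
            best_total, best_lineup = total, lineup
    return best_total, best_lineup

# Hypothetical three-player, three-slot example (numbers invented for illustration)
example = {
    "A": [5.0, 2.0, 0.0],
    "B": [4.0, 4.5, 1.0],
    "C": [1.0, 0.5, 2.0],
}
total, order = best_order(example)
print(order, total)
```

With 9 players there are 362,880 orderings, which is still small enough to brute-force.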

I just did the 86 Red Sox. The optimal order came in at 18 runs better than the random order, with almost half of that gain due to Wade Boggs being put at either leadoff or 2nd.

1 or 2: Boggs, Evans
3: Gedman
4: Rice
5: Buckner
6: Baylor
7: Armas
8: Romero
9: Barrett

The important ones to get were Boggs and Evans.

In actual fact, with a player like Rickey Henderson (though there is no one like him), you can add more runs by putting him at leadoff and putting the other 8 guys in random order than by putting Rickey in his worst spot and putting the other 8 guys in their optimal order. For example, in 1983, Rickey Henderson batting leadoff adds 13 runs to his normal LW, but batting 7th loses 6 runs. That 19-run swing is practically impossible to make up with the other 8 guys, even under the best of conditions (as shown above with the Sox and Spos).

mgl posted October 12th, 2000 03:49 AM
Senior Member
Member Since: Apr 2000

I just crunched the 93-99 NL and AL PBP data. Here are some of the results:

Relative PA by batting slot, and PA per game (columns: slot, relative PA, PA per game)

NL

1 1.100 4.72
2 1.074 4.61
3 1.048 4.50
4 1.024 4.40
5 1.000 4.30
6 0.974 4.19
7 0.948 4.07
8 0.922 3.96
9 0.895 3.85

AL

1 1.100 4.77
2 1.074 4.66
3 1.048 4.55
4 1.025 4.45
5 1.000 4.34
6 0.977 4.24
7 0.951 4.13
8 0.925 4.02
9 0.898 3.90

As you can see, the NL and AL ratios are almost the same. Also, every slot gets almost exactly .11 PA per game more than the slot below it. This (the difference between adjacent slots) is 17.8 PA per 162 games or around 2.35 runs per 162 games.
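
A quick Python check of the slot-to-slot gap, using the NL PA-per-game column from the table above. The helper name is made up. Because the table is rounded to two decimals, the average gap comes out slightly under the rounded .11 quoted in the post.

```python
# PA per game by batting slot (NL), copied from the table above
nl_pa_per_game = [4.72, 4.61, 4.50, 4.40, 4.30, 4.19, 4.07, 3.96, 3.85]

def mean_gap(pa_per_game):
    """Average drop in PA per game from one slot to the next."""
    gaps = [hi - lo for hi, lo in zip(pa_per_game, pa_per_game[1:])]
    return sum(gaps) / len(gaps)

gap = mean_gap(nl_pa_per_game)
print(round(gap, 3), "PA per game,", round(gap * 162, 1), "PA per 162 games")
```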

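Here are the frequencies with which each slot comes up in each of the 24 possible bases/outs situations (columns are batting slots 1 through 9; an x marks an occupied base, in the order 1st, 2nd, 3rd):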

NL

0 outs
--- .414 .181 .172 .248 .240 .219 .231 .239 .217
x-- .035 .094 .048 .044 .059 .055 .052 .054 .058
-x- .008 .036 .018 .017 .019 .020 .017 .016 .015
--x .001 .007 .004 .003 .003 .003 .003 .003 .003
xx- .006 .010 .023 .015 .013 .016 .015 .014 .014
x-x .003 .002 .011 .006 .005 .006 .005 .005 .005
-xx .001 .003 .006 .004 .004 .004 .004 .003 .003
xxx .002 .002 .003 .006 .004 .004 .004 .004 .004

1 out
--- .135 .285 .132 .125 .180 .177 .162 .170 .175
x-- .043 .047 .100 .061 .052 .069 .067 .061 .066
-x- .033 .025 .054 .031 .026 .031 .032 .031 .031
--x .009 .008 .023 .011 .009 .010 .012 .010 .010
xx- .015 .019 .019 .039 .029 .024 .028 .027 .025
x-x .006 .007 .008 .019 .014 .011 .011 .012 .011
-xx .008 .006 .007 .012 .010 .008 .009 .009 .008
xxx .006 .006 .007 .008 .014 .011 .009 .011 .010

2 outs
--- .119 .104 .217 .114 .108 .147 .144 .135 .137
x-- .042 .050 .048 .094 .061 .054 .066 .066 .070
-x- .045 .036 .033 .056 .042 .036 .041 .041 .036
--x .013 .017 .015 .027 .018 .015 .017 .017 .016
xx- .022 .025 .024 .026 .045 .036 .030 .034 .040
x-x .010 .010 .013 .013 .023 .018 .015 .016 .018
-xx .014 .009 .008 .008 .013 .011 .011 .011 .010
xxx .009 .010 .009 .010 .010 .016 .013 .011 .017

AL

0 outs
--- .394 .174 .174 .246 .238 .220 .231 .232 .232
x-- .043 .094 .045 .046 .061 .058 .054 .057 .057
-x- .013 .033 .017 .015 .019 .018 .016 .017 .017
--x .002 .007 .004 .002 .002 .002 .002 .002 .002
xx- .011 .012 .024 .014 .015 .017 .016 .014 .014
x-x .004 .004 .012 .008 .005 .006 .006 .005 .005
-xx .002 .003 .006 .004 .004 .005 .004 .004 .003
xxx .003 .003 .003 .006 .004 .004 .005 .004 .004

1 out
--- .135 .270 .129 .128 .177 .173 .161 .169 .169
x-- .050 .050 .099 .059 .055 .071 .069 .063 .063
-x- .028 .028 .054 .029 .024 .027 .029 .029 .033
--x .009 .011 .020 .011 .007 .009 .010 .009 .010
xx- .020 .022 .020 .039 .029 .026 .029 .028 .025
x-x .009 .009 .010 .022 .014 .011 .013 .013 .011
-xx .009 .008 .009 .012 .009 .009 .010 .010 .010
xxx .006 .008 .007 .009 .014 .012 .012 .011 .010

2 outs
--- .109 .104 .206 .112 .111 .142 .141 .133 .139
x-- .048 .048 .050 .098 .063 .060 .071 .069 .066
-x- .035 .036 .035 .055 .039 .032 .035 .039 .038
--x .014 .017 .016 .025 .015 .013 .014 .017 .015
xx- .024 .027 .025 .027 .049 .038 .034 .036 .034
x-x .012 .012 .014 .015 .023 .019 .015 .016 .016
-xx .010 .010 .010 .008 .013 .011 .010 .011 .012
xxx .009 .010 .010 .011 .011 .018 .014 .013 .013

Notice the high frequencies of runners on 2nd, 2nd and 3rd, and 3rd, with 1 out (or 2 outs in the 1 hole in the NL) after a slot which might sac bunt.

One last set of charts. Here are the lwts by batting order slot. This time I used neutral "before" RE and neutral "after" RE. (i.e. I used the SAME (average) RE's for any slot in the BO; i.e. there were only 24 RE's used). The best lwts to use to test a batting order are probably somewhere in between the ones I posted before and these. The problem we have is this: If we test a fairly typical #8 hitter in the number 8 slot surrounded by typical hitters in the other slots then we can use my original lwts for a #8 hitter. However, if we test a great hitter in the #8 slot, those lwt values are very bad. In the case of the great # 8 hitter, we would need to use lwt values that were based on a much higher "before" RE. We have a similar problem if we were testing ANY kind (good or bad) of #8 hitter in the 8 hole, if the subsequent hitters (#9, #1, etc.) were better than average. The "before" AND the "after" RE's would be higher than we thought.

Anyway, here are the "revised" lwt values (based on average, batting slot neutral "before" and "after" RE's):

Oh, and I also list the average RE for each slot, as David asked about. The first RE value is the "actual" RE, which takes into consideration the ability of the average player in each slot. The second RE is an "ability neutral" RE, which assumes an average hitter in each slot. IOW, the second RE (ability neutral) is hypothetical; it only takes into consideration the average bases/outs frequencies for each slot. It assumes the same (average ability) batter in each slot. Obviously, for the neutral RE's, the bases/outs frequencies for each slot are based on "real" batters in the previous slots (otherwise neutral RE's would be the same for every slot).


"real" RE, "neutral" RE, out, bb, s, d, t, hr, sb, cs

NL

1 .536 .498 -.259 .30 .42 .70 0.98 1.26 .18 -.40
2 .577 .515 -.262 .30 .44 .72 1.04 1.35 .22 -.47
3 .583 .515 -.262 .28 .45 .77 1.08 1.40 .20 -.45
4 .559 .541 -.283 .30 .49 .83 1.11 1.45 .18 -.40
5 .506 .536 -.283 .31 .49 .82 1.11 1.44 .19 -.43
6 .451 .522 -.276 .30 .46 .78 1.05 1.42 .20 -.45
7 .416 .519 -.276 .30 .45 .77 1.13 1.39 .19 -.44
8 .413 .518 -.273 .30 .44 .76 1.09 1.36 .18 -.44
9 .430 .515 -.277 .31 .44 .77 1.18 1.45 .20 -.57

AL

1 .583 .557 -.287 .34 .47 .73 1.00 1.29 .16 -.42
2 .609 .565 -.289 .33 .49 .77 1.03 1.37 .18 -.52
3 .604 .560 -.291 .30 .49 .79 1.01 1.41 .18 -.47
4 .587 .577 -.311 .32 .51 .82 1.05 1.43 .19 -.42
5 .546 .570 -.308 .34 .51 .82 1.12 1.44 .18 -.45
6 .508 .563 -.304 .33 .49 .80 1.14 1.42 .18 -.46
7 .491 .562 -.303 .33 .49 .80 1.07 1.40 .17 -.48
8 .490 .557 -.301 .33 .49 .80 1.10 1.41 .18 -.49
9 .517 .553 -.298 .32 .48 .78 1.00 1.41 .18 -.49

As we've said before, your #1 and #2 hitters better have a good SB/CS ratio! Also, you don't want to put high HR guys in the #1 or #2 holes (all other things being equal). Also, in the AL (and all other high scoring environments) there is a high premium on high average and on-base guys (outs are very costly). Of course, this would also be true in NL hitters' parks like Coors and Enron. In the NL (and other low scoring environments), there is a high premium on triples, home runs, and to some extent stolen bases (assuming a good ratio).

Good luck on analyzing these numbers!!

tangotiger posted October 12th, 2000 01:57 PM
Senior Member
Member Since: May 2000

MGL, thanks a lot for your hard work.... You've just given me more toys to play with..... Thanks

Voros posted October 12th, 2000 02:42 PM
Member Since: Jan 2000

I'll just pop in here to say that the 17.8 PAs per slot in the order is EXTREMELY logical. Every game ends at a particular spot in the order. There are 162 games and 9 slots in the order. 162/9 = 18.

Theoretically, every slot in the order is worth 18 PAs. The actual data supports this. So I think we can conclude that whatever slot in the order ends the game is more or less evenly distributed.
__________________
Voros McCracken

David Smyth posted October 12th, 2000 05:55 PM
Sports Guru
Member Since: Dec 1999
Location: Lake Vostok

For my first look at these interesting numbers, I did what I mentioned above, which is multiply, for each slot, the average (real) RE by the average PA/G. This should produce an overall 'potential value' for each slot. Here is the result, put on a scale of 100 = average.

NL: 118, 124, 122, 114, 101, 88, 77, 76, 77

AL: 116, 119, 115, 109, 99, 90, 85, 82, 85
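
The slot index described above can be reproduced with a few lines of Python. This is a sketch using the NL "real" RE and PA-per-game columns from mgl's tables; the function name is made up. A slot or two may differ by a point or so from the figures posted, presumably because of rounding in the inputs.

```python
# NL "real" RE per PA and PA per game by slot, copied from mgl's tables
nl_real_re = [0.536, 0.577, 0.583, 0.559, 0.506, 0.451, 0.416, 0.413, 0.430]
nl_pa_per_game = [4.72, 4.61, 4.50, 4.40, 4.30, 4.19, 4.07, 3.96, 3.85]

def slot_index(re_per_pa, pa_per_game):
    """RE x PA/G for each slot, rescaled so the average slot is 100."""
    raw = [re * pa for re, pa in zip(re_per_pa, pa_per_game)]
    mean = sum(raw) / len(raw)
    return [100 * value / mean for value in raw]

print([round(v) for v in slot_index(nl_real_re, nl_pa_per_game)])
```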

The general idea I get here is that lineups should be 'front-loaded' more than they typically are. That, instead of centering around the 3rd and 4th batters, they should center around the 2nd and 3rd batter. The 4th slot is too far from the beginning to put the team's best hitter, as is frequently done. An effort should be made to develop better lead-off skills in players with low power. The AL strategy of using a '2nd leadoff' hitter in the 9th slot seems valid. In fact, on an NL team with a very good top of the order, it may be correct or at least reasonable to bat the pitcher 8th.

Of course, these numbers are based on the way they're actually constructing lineups. If they started making all these changes, the REs would also change, and the calculations would need to be done again.


dodgerdog posted October 12th, 2000 07:19 PM
Senior Member
Member Since: Oct 1999

This may be just because I am a Dodgers fan...but it seems to me that the Dodgers would/could be a perfect team to do exactly what you are describing for a couple reasons...
Mainly, the Starting Pitchers have some decent pop, and
Tom Goodwin would be the leadoff hitter.

Just to use their 2000 lineup:

Goodwin, Grudzielanek, Sheffield, Green, Karros, Hundley, Beltre, Cora, Pitcher.

I was thinking that something like....

Beltre, Green, Sheffield, Hundley, Karros, Grudzielanek, Cora, Pitcher, Goodwin...

Would the 2000 Dodgers work as an example for a team who should employ the *Larussa Way* (Goodwin seems near perfect for the role)
__________________
Woe is the Dodgers Fan

tangotiger posted October 12th, 2000 09:01 PM
Senior Member
Member Since: May 2000

As I've mentioned, the two spots where there is the most variation are the leadoff and 2nd spots, and this is the reason that the best hitters in the game have their highest LW values in the #2 spot. The worst hitter would fall in the 8th spot, though when you are talking about a pitcher, I'd have to play with the numbers to see if this is true.

And as David surmised, yes, the cleanup hitter is so far away from the top, and he loses so many PAs relative to the top, that it requires a player of a particular profile to bat here. Someone like Mike Piazza is the ideal. High-HR, high-average, not high-walks.

Voros, excellent pickup there. Sometimes the most obvious things are hidden in front of our eyes.

tangotiger posted October 12th, 2000 09:21 PM
Senior Member
Member Since: May 2000

MGL, one thing that strikes me as interesting with your batter-neutral REs: look at the outs. Way back when, using simple percentages, I concluded that the value of the typical average hitter batting leadoff would have an out value of .29 compared to the league average of .30. That's pretty much what your chart shows. But, since the best hitters are coming up after him, then a value of .30 turns out to be reasonable (as also demonstrated in your original chart). Look at all the other hitters, and it would not be a stretch to say that all outs should basically get -.30 runs, assuming "real" batters hitting after them.

I'll have to mull through the other numbers as well....

mgl posted October 12th, 2000 09:49 PM
Senior Member
Member Since: Apr 2000

Tango,

Sure. As I said before, the subsequent batters don't have that much influence on the out value. However, the quality of the batter we are looking at has a lot of influence on HIS OWN out value. This is kind of weird. (A batter with a lot of outs has a low out value, while a batter with few outs has a high out value. This suggests that, as you keep saying, maybe we should just use a constant value for the out, regardless of the player or the slot!) In fact, the more I think about it, the only adjustments to the lwt values should be based on the different bases/outs frequencies for each slot.

[Edited by mgl on October 12th, 2000 at 08:51 PM]

tangotiger posted October 14th, 2000 03:02 AM
Senior Member
Member Since: May 2000

MGL, couldn't agree more on the frequency of base/out thing.

First off, thanks to MGL for providing the frequency matrix of the 24 states for each batting spot.

Now, this is what I've done. Using the overall frequency rate and the overall RE values for the 24 base/out situations, I compute the expected chance of scoring from 1B, 2B, 3B. For example, the man on 1st, no out situation has an RE of .95, while the bases-empty, no out state is .56. That difference, .39, is the value of the runner being on 1B. That is, he has a 39% chance of scoring from 1B with no outs. Do the same with 1 and 2 outs, and you get a 27% chance, and 13%. Now, what is the chance of a player getting on base in each of those 3 situations? 35% of all PAs are with no outs, 33% with 1 out, and 32% with 2 outs. Multiplying all this out, we get .265, or a 26.5% chance of a hitter getting a single and eventually scoring. Do this for all the hitter's events, and you get the run-scoring part of LW.

Next, figure out the baserunner-moving ability of the hitter. Following the same concept, but this time applying the frequencies as the batter sees them with a particular runner on base (more likelihood of a runner being on 3rd with 2 outs rather than 0 outs, thereby reducing the chance that the runner will score, and thereby INCREASING the value of driving him in, etc), we see that moving a runner from 1B to 2B adds .16 runs on a single. Move him to third, and that adds another .17 runs. Assuming that 32% of the time a single will give the extra base, and realizing that 32% of the time there is a runner on 1B, etc, etc, etc, yields a run value of .22 for a single to advance all runners on base. Add .22 to .26 for .48 and that is the LW value of the single.

Here are the complete LW results:
s + 0.483
d + 0.780
t + 1.047
hr + 1.426
bb + 0.344
sb + 0.167
cs - 0.449
out - 0.311

First off, these numbers are nearly identical to MGL's, even though I took a totally different approach. The big advantage I gain is that I don't lose sample size as I am about to get into more specifics here, with the batting slots. The only leap I made is the assumption that all rates of production are the same in all 24 base/out situations (HR rate the same with men on base as with bases empty). This explains why my HR LW value is slightly higher. The BB is slightly higher because I did not consider IBB. The out value is somewhat off, and I'm not sure why.

Anyway, now that I did that for the league average, all I have to do is apply MGL's frequency table for each batting spot, and in 2 seconds, we can get the LW values by batting order. (My assumption carries through everywhere, thereby negating its effects.)

Here's what the leadoff LW values look like, using the AL leadoff frequency base/out rates (in the order s, d, t, hr, bb, sb, cs, out):
0.461 0.743 1.000 1.332 0.347 0.167 -0.449 -0.297
Here's what they look like when compared to the neutral LW values from above:
(0.022) (0.037) (0.047) (0.095) 0.003 0.000 0.000 0.013
(Parentheses mean negative.)

So, we see that the leadoff hitter loses a lot of his HR value, and some of his hit values. He gains very little walk value, and his outs are not as damaging as other hitters'.

Note that all this assumes an average batter coming up after him.

Here are the LW adjustments relative to the neutral values for each spot (columns: s, d, t, hr, bb, sb, cs, out):
1 (0.022) (0.037) (0.047) (0.095) 0.003 0.000 0.000 0.013
2 (0.006) (0.018) (0.020) (0.038) 0.002 0.001 0.000 0.006
3 (0.009) (0.019) (0.032) (0.004) (0.016) 0.000 (0.001) 0.004
4 0.016 0.029 0.037 0.051 (0.001) (0.001) 0.000 (0.012)
5 0.018 0.030 0.039 0.041 0.007 (0.001) 0.000 (0.009)
6 0.007 0.013 0.018 0.025 0.006 (0.000) (0.000) (0.004)
7 0.001 0.006 0.009 0.017 0.003 (0.000) 0.000 (0.002)
8 0.002 0.006 0.008 0.016 (0.000) (0.000) 0.000 (0.001)
9 (0.003) (0.004) (0.006) 0.003 (0.004) 0.000 0.000 0.003

So, to figure out the LW of a player for a batting spot, figure out his LW as you normally would. Apply the above adjustments for the particular spot. (Remember, these adjustments are based on the frequency of him coming up in the 24 particular base/out states, and their respective RE). Then apply the PA adjustment as MGL pointed out. And there you have it, finally, the LW by batting order.
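
Here is a minimal Python sketch of that recipe for two of the nine slots. The neutral LW values and adjustment rows are copied from the tables above, and the relative PA figures are mgl's AL numbers. Two things are assumptions and not something stated in the thread: the PA adjustment is applied here by simply scaling the hitter's total by the slot's relative PA, and the stat line is invented.

```python
EVENTS = ["s", "d", "t", "hr", "bb", "sb", "cs", "out"]

# Neutral LW values from the post
NEUTRAL_LW = {"s": 0.483, "d": 0.780, "t": 1.047, "hr": 1.426,
              "bb": 0.344, "sb": 0.167, "cs": -0.449, "out": -0.311}

# Adjustment rows for slots 1 and 4, copied from the table (parentheses = negative)
SLOT_ADJ = {
    1: [-0.022, -0.037, -0.047, -0.095, 0.003, 0.000, 0.000, 0.013],
    4: [0.016, 0.029, 0.037, 0.051, -0.001, -0.001, 0.000, -0.012],
}

# Relative PA by slot (AL), from mgl's table earlier in the thread
RELATIVE_PA = {1: 1.100, 4: 1.025}

def slot_weights(slot):
    """Neutral LW plus the slot's adjustment, per event."""
    return {e: NEUTRAL_LW[e] + adj for e, adj in zip(EVENTS, SLOT_ADJ[slot])}

def runs_in_slot(stat_line, slot):
    """ASSUMPTION: the PA adjustment is a straight scaling of the
    hitter's total by the slot's relative PA."""
    weights = slot_weights(slot)
    raw = sum(weights[e] * stat_line.get(e, 0) for e in EVENTS)
    return raw * RELATIVE_PA[slot]

# Hypothetical stat line (invented numbers, for illustration only)
hitter = {"s": 100, "d": 30, "t": 3, "hr": 25, "bb": 70, "sb": 10, "cs": 4, "out": 400}
for slot in (1, 4):
    print(slot, round(runs_in_slot(hitter, slot), 1))
```

Filling in the other seven rows of SLOT_ADJ and RELATIVE_PA from the tables gives a full player-by-slot grid.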

And if you take a quick glance, while it seems the top 3 hitters have a negative adjustment, remember that they have a positive adjustment for PA, which is the whole reason that all this comes out in the wash, and the batting order yields a minuscule amount of run deviation.

mgl posted October 14th, 2000 05:53 AM
Senior Member
Member Since: Apr 2000

Tango,

Good work! I can plug your values into a program and in about 1 second (or less) compute the ideal batting order for any team! Of course, every team would have a different BO v. lefty and righty pitchers. You could even compute a unique ideal BO for each game, depending upon who is pitching (use a log5 formula for predicting the outcome of each offensive event), the park, weather, etc. Do you think any ML team would be interested in a program like this? (That is a rhetorical question, BTW!) In reality, only 2 ideal lineups would suffice for a team - one v. LHP's and 1 v. RHP's. If I get a chance this weekend, I'll write such a program and e-mail a (virus-free) DOS version to anyone who is interested.
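
For reference, a sketch of the log5 idea in Python. The post does not say which variant is meant, so this uses the commonly cited odds-ratio form for a batter/pitcher matchup, and the function and argument names are made up.

```python
def log5_rate(batter_rate, pitcher_rate, league_rate):
    """Odds-ratio (log5-style) estimate of how often an event happens
    when this batter faces this pitcher.
    ASSUMPTION: all three rates are for the same event and are strictly
    between 0 and 1."""
    num = batter_rate * pitcher_rate / league_rate
    den = num + (1 - batter_rate) * (1 - pitcher_rate) / (1 - league_rate)
    return num / den

# A batter facing a league-average pitcher should keep his own rate
print(log5_rate(0.300, 0.250, 0.250))
# The same batter facing a tougher-than-average pitcher
print(log5_rate(0.300, 0.220, 0.250))
```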

tangotiger posted October 14th, 2000 12:53 PM
Senior Member
Member Since: May 2000

Actually, the other thing I need to make it complete is the frequency rates of SB attempts. It was a fair thing to say that a batter does not have much control over his hitting prowess with the 24 states. However, SB are totally something else. (This is why there is really no adjustment for SB in the LW values above).

First thing to decide is if we should distinguish between SB and CS, or just do SB attempts, and apply league average SB%. Undecided.

The other thing, though, is that I would only need the overall league average frequency rates for the 24 states. The reason is that the frequency rates for the leadoff hitter will be so skewed simply because of the disproportion of the players there. I'm still undecided on this one as well. Maybe you can produce both (by batting spot, and overall).

After I get all this, I have everything in one spreadsheet, which I will be happy to provide to anyone. With this spreadsheet, you can now plug in any frequency rates for the 24 states, and it will tell you exactly the LW values for each of the offensive events.


David Smyth posted October 14th, 2000 09:16 PM
Sports Guru
Member Since: Dec 1999
Location: Lake Vostok

Tango,

Yes, good work.

Before I mess around with this stuff, could you specify 1) what was the data you used? 1993-99 cumulative both leagues together? Something else? 2) what was the run expectancy (to 3 decimals) of the initial 0-0 state?

tangotiger posted October 15th, 2000 12:30 AM
Senior Member
Member Since: May 2000

I used MGL's AL frequency matrix for each spot for the 24 base/out, and the overall RE for AL for the 24 base/out. The 0/0 RE was .56.

A "bad" thing is that I assume that the next batter is also league average. In the case of the #8 NL hitter, this is totally wrong, and therefore, while 27% of singles result in runs overall, maybe only 23% result in runs for the #8 NL hitter. This is just to reiterate that hitters do not operate in a vacuum.

However, what we've done is simply to look at the batting order spot specifically, based on the batters BEFORE you, to see how often you come up with each situation, and how that impacts your offensive events.

The next thing to do, and MGL has provided me with enough data to do so, is to account for this. But this is a REALLY LONG process. (What you need to do is use the RE for each spot to come up with the %chance of scoring from 1B, 2B, 3B, and use THOSE figures, by batting spot. The run-scoring part of LW is dependent on the batters following you. The other part of LW, the base-runner movement part does not change, since the current batter IS the next batter, from the runner's viewpoint, and we have constructed our model based on the current batter ALWAYS being league average.)


tangotiger posted October 15th, 2000 01:24 AM
Senior Member
Member Since: May 2000

MGL, another favor, if I may. Would you be able to spit out the "real" actual RE base/out matrix for each batting spot? I use these things to determine the chance of scoring from any base with any out. I thought you had posted it, but I only see the resultant LW values of that.

Based on my calculations, here are the chances of scoring from each base overall:
1B, 0,1,2 outs, overall: 0.390 0.266 0.130, 0.265
2B, 0,1,2 outs, overall: 0.636 0.414 0.237, 0.434
3B, 0,1,2 outs, overall: 0.878 0.674 0.288, 0.621

You can't simply average the three out columns to get the overall figure, since there are more PAs with 0 outs than 2 outs.
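To make the point concrete, here is a minimal Python sketch that compares the unweighted mean of the three out columns with the overall figure posted above. The numbers are copied from the post; the names `posted` and `simple_mean` are just illustrative, and the actual PA weights are not given here, so the sketch only shows the size of the gap, not how the overall figure was computed.

```python
# Chance of scoring from each base with 0, 1, 2 outs, plus the posted overall figure.
posted = {
    "1B": ((0.390, 0.266, 0.130), 0.265),
    "2B": ((0.636, 0.414, 0.237), 0.434),
    "3B": ((0.878, 0.674, 0.288), 0.621),
}

def simple_mean(values):
    """Unweighted average, which ignores how often each out state occurs."""
    return sum(values) / len(values)

for base, (by_outs, overall) in posted.items():
    naive = simple_mean(by_outs)
    print(f"{base}: simple mean {naive:.3f}, posted overall {overall:.3f}, "
          f"gap {overall - naive:+.3f}")
```

In all three cases the unweighted mean lands a little below the posted overall number, which is what you would expect if the lower-out states carry more weight.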

What I want to do is come up with a similar set for each batting spot and see whether they differ much.


mgl posted October 15th, 2000 08:01 AM
Senior Member
Member Since: Apr 2000
Location:

Here are the 93-98 bases/outs matrices by batting order...

AL
Spot 1         | Spot 2         | Spot 3
0.62 0.34 0.11 | 0.62 0.34 0.12 | 0.58 0.35 0.14
1.03 0.57 0.23 | 1.07 0.62 0.25 | 1.05 0.64 0.29
1.29 0.77 0.36 | 1.34 0.84 0.37 | 1.28 0.81 0.39
1.75 1.01 0.45 | 1.57 1.11 0.45 | 1.50 1.10 0.43
1.68 1.04 0.52 | 1.74 1.09 0.50 | 1.81 1.13 0.54
1.96 1.22 0.51 | 1.94 1.26 0.57 | 1.92 1.32 0.67
1.96 1.53 0.62 | 2.27 1.35 0.68 | 2.24 1.65 0.62
2.57 1.85 0.78 | 2.54 1.88 0.87 | 2.63 1.71 1.00

Spot 4         | Spot 5         | Spot 6
0.56 0.31 0.14 | 0.52 0.29 0.12 | 0.49 0.27 0.10
0.95 0.60 0.29 | 0.92 0.56 0.25 | 0.89 0.53 0.23
1.15 0.74 0.39 | 1.13 0.64 0.36 | 1.17 0.66 0.34
1.31 0.96 0.42 | 1.38 0.92 0.32 | 1.25 0.92 0.41
1.62 1.04 0.53 | 1.55 1.01 0.50 | 1.45 0.93 0.44
1.98 1.29 0.60 | 1.92 1.25 0.62 | 1.67 1.18 0.53
2.21 1.52 0.72 | 1.95 1.42 0.68 | 2.01 1.42 0.64
2.60 1.71 0.79 | 2.33 1.78 0.88 | 2.05 1.64 0.83

Spot 7         | Spot 8         | Spot 9
0.49 0.26 0.09 | 0.51 0.25 0.09 | 0.58 0.28 0.08
0.82 0.51 0.23 | 0.88 0.50 0.21 | 0.89 0.55 0.21
1.10 0.65 0.32 | 1.13 0.66 0.29 | 1.22 0.71 0.31
1.43 0.91 0.34 | 1.16 0.92 0.34 | 1.33 0.92 0.37
1.55 0.83 0.46 | 1.49 0.86 0.43 | 1.56 0.91 0.42
1.79 1.11 0.48 | 1.73 1.22 0.48 | 1.81 1.22 0.51
1.92 1.33 0.61 | 1.99 1.34 0.63 | 1.97 1.28 0.55
2.07 1.54 0.81 | 2.22 1.48 0.79 | 2.48 1.53 0.76

NL
Spot 1         | Spot 2         | Spot 3
0.60 0.33 0.10 | 0.61 0.33 0.13 | 0.57 0.34 0.13
0.96 0.61 0.22 | 1.04 0.62 0.27 | 1.04 0.64 0.32
1.14 0.78 0.32 | 1.33 0.73 0.37 | 1.22 0.80 0.40
1.06 0.97 0.38 | 1.54 1.03 0.39 | 1.57 1.13 0.41
1.71 0.97 0.42 | 1.66 1.05 0.48 | 1.64 1.09 0.55
2.19 1.23 0.52 | 1.91 1.29 0.60 | 1.89 1.27 0.60
2.01 1.45 0.58 | 2.13 1.49 0.62 | 2.09 1.62 0.71
2.50 1.49 0.81 | 2.80 1.69 0.73 | 2.56 1.77 0.95

Spot 4         | Spot 5         | Spot 6
0.49 0.30 0.13 | 0.45 0.24 0.10 | 0.41 0.23 0.08
0.94 0.63 0.27 | 0.83 0.56 0.26 | 0.77 0.46 0.21
1.19 0.77 0.35 | 1.04 0.73 0.37 | 1.00 0.64 0.33
1.22 1.03 0.42 | 1.17 0.93 0.39 | 1.18 0.89 0.37
1.52 1.06 0.56 | 1.43 0.99 0.49 | 1.28 0.91 0.46
1.80 1.25 0.58 | 1.82 1.18 0.55 | 1.63 1.13 0.52
2.15 1.41 0.67 | 2.01 1.46 0.56 | 1.85 1.27 0.59
2.32 1.60 0.84 | 2.21 1.68 0.80 | 2.34 1.55 0.79

Spot 7         | Spot 8         | Spot 9
0.41 0.18 0.08 | 0.46 0.19 0.06 | 0.50 0.24 0.06
0.74 0.43 0.18 | 0.72 0.37 0.17 | 0.85 0.41 0.13
0.96 0.55 0.30 | 1.00 0.53 0.26 | 1.05 0.56 0.22
1.44 0.84 0.37 | 1.42 0.86 0.32 | 1.21 0.89 0.28
1.38 0.73 0.43 | 1.32 0.78 0.34 | 1.44 0.75 0.33
1.50 0.98 0.47 | 1.60 0.90 0.39 | 1.71 0.88 0.34
1.79 1.16 0.55 | 1.93 1.27 0.46 | 1.77 1.18 0.44
1.92 1.47 0.74 | 2.12 1.33 0.70 | 2.14 1.36 0.54

9 (pitchers only)

0.48 0.21 0.05
0.81 0.36 0.12
0.97 0.52 0.18
1.06 0.84 0.26
1.48 0.65 0.27
1.76 0.75 0.28
1.68 1.06 0.36
2.07 1.10 0.44

As you can see, the RE's by batting order suffer, to some degree, from "small sample-itis," especially the states that are less common...

tangotiger posted October 15th, 2000 10:49 AM
Senior Member
Member Since: May 2000
Location:

Thanks, MGL, the small sample-itis problem doesn't affect me, since I combine the RE with the frequency.

Anyway, using the RE and frequency tables for each batting spot, the leadoff hitter looks like this:
Chance of scoring from each base: 28.4%, 49.0%, 73.4%

Looking at the #7 hitter: 24.0%, 41.3%, 59.8%

These percentages are based on the frequency and the RE for the batting spot.

However, if we apply an average frequency, and simply allow the RE to be variable (thereby giving the whole credit for the chance of scoring based on the real batters following them), we get:
leadoff: 25.4%, 44.1%, 62.9%
#7 hitter: 24.2%, 41.8%, 60.9%
Overall (previously posted): 26.5%, 43.4%, 62.1%

Now, look at this. The leadoff hitter,
with real frequency, and real batters: 28.4% chance of scoring from 1B
with avg frequency, and real batters: 25.4% chance
with avg frequency, and avg batters: 26.5% chance

This means that the batters behind the leadoff hitter do a POOR job, worse than average, of moving him over. Just to be sure, here is the chance of scoring from 1B for the leadoff hitter, given real frequency but average batters: 28.9%

There you have it. Just because you have better batters behind you does not mean you will have a better chance of scoring.
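The four leadoff numbers above form a 2x2 comparison (real vs. average frequency, real vs. average following batters). As a rough sketch, assuming nothing beyond the posted percentages, the two effects can be pulled apart by simple subtraction. The dictionary layout and function names are illustrative only.

```python
# Leadoff hitter's chance of scoring from 1B, in percent, as posted.
# Keys are (frequency source, following-batters source).
leadoff_1b = {
    ("real", "real"): 28.4,
    ("avg", "real"): 25.4,
    ("avg", "avg"): 26.5,
    ("real", "avg"): 28.9,
}

def effect_of_batters(freq):
    """Real following batters minus average batters, holding frequency fixed."""
    return leadoff_1b[(freq, "real")] - leadoff_1b[(freq, "avg")]

def effect_of_frequency(batters):
    """Real base/out frequency minus average frequency, holding batters fixed."""
    return leadoff_1b[("real", batters)] - leadoff_1b[("avg", batters)]

for freq in ("avg", "real"):
    print(f"batters effect at {freq} frequency: {effect_of_batters(freq):+.1f} points")
for batters in ("avg", "real"):
    print(f"frequency effect with {batters} batters: {effect_of_frequency(batters):+.1f} points")
```

Either way you hold the other factor fixed, the batters effect comes out negative and the frequency effect positive, which is the point of the post: the leadoff hitter's edge comes from the situations he is on base in, not from who bats behind him.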



David Smyth posted October 15th, 2000 02:00 PM
Sports Guru
Member Since: Dec 1999
Location: Lake Vostok

I think it's time to put all this stuff to the test. In an earlier post Tango came up with an ideal batting order for the 1986 Red Sox according to the earlier LWts by batting order numbers:

1)Boggs or Evans
2)Evans or Boggs
3)Gedman
4)Rice
5)Buckner
6)Baylor
7)Armas
8)Romero
9)Barrett

I'll do an analysis now, the way I've always done it. I first put the players in order of hitting ability (here I'll use OTS*34), along with the SLG/OBA ratio as an indicator of style.

1)Boggs........7.49 R/G.....1.07 SLG/OBA
2)Rice.........6.40.........1.28
3)Evans........6.09.........1.27
4)Baylor.......5.13.........1.28
5)Barrett......4.57.........1.08
6)Gedman.......4.54.........1.35
7)Buckner......4.45.........1.35
8)Armas........4.24.........1.35
9)Romero.......2.65.........1.07

Right off the bat, Tango's use of Gedman at #3 seems strange.

The list above, in straight order of ability, is the 'Earnshaw Cook' batting order. It's usually not the optimal lineup for a real team, but it's the appropriate place to begin. Next I look to see if I should rearrange the first 3 (sometimes 4) batters according to style differences. In this case, the first batter already has the best lead-off profile (low SLG/OBA), and the next 3 have about the same style ratios. I see no reason to move anyone. The last 5 batters should generally be left in order of ability. One thing to consider, though, is whether the 9th batter should be shifted to 8th (the second leadoff man theory). This team is as strong an indication for doing so as you're likely to find, because of the very strong top of the order. So Romero moves to 8th, and the obvious choice to replace him at 9th is Barrett.

1)Boggs
2)Rice
3)Evans
4)Baylor
5)Gedman
6)Buckner
7)Armas
8)Romero
9)Barrett
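For anyone who wants to replay those steps, here is a small Python sketch using the R/G and SLG/OBA figures posted above. One part is my own assumption and not stated in the post: the 9th spot is given to whichever of the 5th through 8th best hitters has the lowest SLG/OBA (the most leadoff-like profile), which is how I read "the obvious choice" being Barrett. The function names are made up for illustration, and the top four are left in ability order because the post found no reason to move them.

```python
# (name, R/G, SLG/OBA) as posted for the 1986 Red Sox.
hitters = [
    ("Boggs", 7.49, 1.07), ("Rice", 6.40, 1.28), ("Evans", 6.09, 1.27),
    ("Baylor", 5.13, 1.28), ("Barrett", 4.57, 1.08), ("Gedman", 4.54, 1.35),
    ("Buckner", 4.45, 1.35), ("Armas", 4.24, 1.35), ("Romero", 2.65, 1.07),
]

def by_ability(players):
    """Best hitter first: the 'Earnshaw Cook' starting order."""
    return sorted(players, key=lambda p: p[1], reverse=True)

def second_leadoff_lineup(players):
    """Move the weakest hitter to 8th and put a leadoff-type hitter 9th."""
    order = by_ability(players)
    top4, middle, worst = order[:4], order[4:8], order[8]
    # Assumption (not from the post): pick the lowest SLG/OBA among the
    # 5th-8th best hitters as the "second leadoff man" in the 9th spot.
    ninth = min(middle, key=lambda p: p[2])
    rest = [p for p in middle if p is not ninth]
    return top4 + rest + [worst, ninth]

lineup = [name for name, _, _ in second_leadoff_lineup(hitters)]
for spot, name in enumerate(lineup, start=1):
    print(f"{spot}){name}")
```

With these inputs the sketch reproduces the nine-man order listed above.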

It will be interesting to see what the latest procedure posted by Tangotiger will result in. Also it would be interesting to check out a sim study for the 1986 BoSox.
