SABR
GAMES and SIMULATIONS
COMMITTEE

Batting Order Optimization

… or, Everybody Needs A S[t]imulating Hobby

A Little History …

I’d been playing roto and salary cap baseball games online for about 15 years when I realized that I needed a better way to determine a player’s relative value. I was pretty handy with Excel (or so I thought) but I was doing it the hard way … there was a lot of cut-and-paste going on. I spent a lot of time figuring out how to determine salary cap value and how to draft more efficiently by finding players whose actual performance was expected to exceed their projected performance … and by better determining where each player should be drafted or what his auction price should be.

About this time, I became aware of “Baseball Hacks” by Joseph Adler … and fell into the rabbit-hole. It did more than just light a fuse. Suddenly, the whole world of MySQL and Perl was opened up before me. I started to retrieve data from MLB’s Gameday server, and spent lots of time working on improving methods for player valuations: for rotisserie drafts and auctions, and for salary-cap games. I was having a ton of fun but knew there was still more. I was aware of R, but hadn’t taken the plunge. Then “Analyzing Baseball Data With R” by Max Marchi and Jim Albert came out. Well, I fell into the rabbit-hole’s rabbit-hole … particularly when I read the chapter on simulation.

About the same time, I was starting to tire of rotisserie. The time it was taking to compete against the big dogs was becoming an issue, particularly on Sundays – which would be consumed by trying to stay up-to-date on last-minute injury issues among other things. I also was bothered by watching games for player performances instead of just enjoying the game itself.

I was also starting to look forward to retirement and was trying to come up with something to do when I wasn’t playing or practicing golf. As I said, the chapter on simulation pretty much made my head explode. Somewhere along the line, as I played around with the scripts provided by the book, it struck me that if you could simulate a team’s performance over the course of a season, why couldn’t you do the same for a batter? And then it also struck me, that if you could simulate a batter’s performance, you should be able to give some thought to where they should be in the batting order … which leads to a whole boatload of questions such as: what is the best batting order?, do teams use the best batting order?, how have orders changed over the years?, can you characterize the type of batter for a particular spot in the order?, and so on …

Batting order debates are timeless, and so is the decision about when to change pitchers

I love to play bridge … and one of the things you need to learn is that you can arrive at the correct contract but card distribution can mean that you won’t make it no matter what. It can also mean that you will make several overtricks, but you could never have justified the higher bid. Another thing is the finesse … sometimes it’s on, sometimes it isn’t … and there are times when it just isn’t worth it. But there are also times where you just have a hunch … kind of like a pitcher and the third time through the order. Similarly, you can throw the ‘best’ batting order out there game after game, but there are games where it just isn’t going to get it done.

How to find [batting] order from chaos

I’m interested in modelling batting order to compare how [presumably] better orders might perform compared to actual orders used. I’m also interested in looking at how batting orders have changed over the years.

The guts of all of this is “s.new <- sample(1:25, 1, prob = P[s, ])” from “Analyzing Baseball Data With R” (p. 217 of the first edition). I’ve also modified the code for calculating transition probabilities … to determine the probabilities for each batter. I like the idea of using a simulation that uses a generated random number … if dice are good enough for all the simulation games out there, a similar method will work for me. Another issue for me is that we skipped matrices in high school math and it’s a little late for learning too many new tricks … notwithstanding the fact that you can’t divide matrices (which I need to do when simulating games considering pitchers and defenses).
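A minimal sketch of that sampling step, written in Python rather than R, might look like the following. The function name `next_state` and the three-state toy matrix are made up for illustration; the real model uses a 25-state base-out matrix, and Python indexes states from 0 where R's `1:25` starts at 1.

```python
import random

def next_state(P, s, rng=random):
    """Draw the next state index, using row s of matrix P as the weights."""
    states = range(len(P[s]))
    return rng.choices(states, weights=P[s], k=1)[0]

# Toy 3-state chain with made-up numbers (NOT real base-out probabilities).
# State 2 is absorbing, playing the role of the "3 outs" state.
P = [
    [0.2, 0.5, 0.3],
    [0.1, 0.3, 0.6],
    [0.0, 0.0, 1.0],
]

rng = random.Random(1961)  # seeded so a run can be repeated
s, path = 0, [0]
while s != 2:
    s = next_state(P, s, rng)
    path.append(s)
print(path)
```

Each pass through the loop is one roll of the dice: the current state picks the row, and the row supplies the odds for where play goes next.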

The simulation in “Analyzing Baseball Data With R” is for one inning. I rewrote the code to make it for nine. I just needed to figure out how to track the last out of any inning and have the correct batter lead off the next. Another issue is that the order stays the same for all nine innings … not a very common scenario in the real world, but this is a simulation.
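The bookkeeping for carrying the batting order from one inning to the next can be sketched in Python as below. This is a deliberately simplified model and not the transition-matrix simulation described here: every plate appearance is assumed to be either an out or a walk, and `simulate_game` and `on_base_prob` are hypothetical names.

```python
import random

def simulate_game(on_base_prob, innings=9, rng=random):
    """Simplified game: each plate appearance is an out or a walk (an assumption).

    on_base_prob[i] is the chance that lineup spot i reaches base.
    Returns (runs, plate_appearances, index_of_next_batter).
    """
    runs = 0
    plate_appearances = 0
    batter = 0  # lineup index; deliberately NOT reset between innings
    for _ in range(innings):
        outs, runners = 0, 0
        while outs < 3:
            if rng.random() < on_base_prob[batter]:
                if runners == 3:
                    runs += 1      # bases loaded: the walk forces a run home
                else:
                    runners += 1   # otherwise the bases just fill up
            else:
                outs += 1
            batter = (batter + 1) % len(on_base_prob)
            plate_appearances += 1
    return runs, plate_appearances, batter

# Placeholder lineup: nine identical batters with a made-up .330 on-base rate.
runs, pa, next_up = simulate_game([0.33] * 9, rng=random.Random(1961))
print(runs, pa, next_up)
```

The only state that survives the end of an inning is `batter`, which is what makes the correct hitter lead off the next one.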

Any group of 9 batters can be arranged in 362,880 (9!) possible batting orders. The chance that only one of these will be significantly better than all the rest is pretty remote, but it seems reasonable to me that those orders that do yield the best results might provide a consensus order … one that, if nothing else, would make the best starting point. Also, as I discovered, the time to simulate all of them and yield a reproducible result is quite staggering. I changed from scripting this entirely in R to Perl in the hopes that I could use GPU processing (which is still the grail I hope for). Either way, it takes an 8-core, 4 GHz PC with 48 GB RAM about 14 hours.

Allowing a simulation to run for 14 hours is very distressing. If the power goes out, you start over. If something dopey like USB power management shuts things down, you start over. It’s also an awfully long time to wait for an answer. So for now, I have decided to ‘cheat’ with a workaround. Any study of all possible orders is restricted to the years before the DH, when the pitcher [almost] always batted ninth. Restricting the orders to 8 players (with the pitcher batting ninth all the time) reduces the number to 40,320 (8!). An analysis can be completed in about 1.5 hours: much more palatable.

So … Why did the 1961 Yankees have Bobby Richardson and Tony Kubek at the top of the order?

At the last meeting, a question was asked about the 1961 Yankees using Bobby Richardson and Tony Kubek at the top of the order. So I have taken a look:

The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at “www.retrosheet.org”.

Of the eight most-regular batters, Richardson (0.295) and Kubek (0.306) had the lowest OBPs for the season. Retrosheet play-by-play data for 1961 was retrieved and stored in a MySQL table. Transition probabilities were calculated using the method (and code) in Marchi and Albert. Individual batter probabilities were retrieved by database query and massaged by an Excel macro into a table equivalent to the one used by the probability function in R.

The order used most often was determined by querying Retrosheet data. I’m struck by just how rarely a team uses its #1 order. The 1961 Yankees’ most-used orders were used 11, 10, and 10 times during the season … although #2 and #3 are very similar. In fact, the 1961 Yankees used 67 (!!) different orders. The most common are:

1         2         3         4         5         6         7         8         games
boyec102  kubet101  marir101  mantm101  berry101  skowb101  howae101  richb102  11
richb102  kubet101  marir101  mantm101  berry101  howae101  skowb101  boyec102  10
richb102  kubet101  marir101  mantm101  howae101  skowb101  lopeh101  boyec102  10
richb102  kubet101  marir101  mantm101  berry101  skowb101  howae101  boyec102  7
kubet101  lopeh101  marir101  mantm101  berry101  skowb101  boyec102  richb102  7
richb102  kubet101  marir101  mantm101  blanj101  howae101  skowb101  boyec102  6
boyec102  kubet101  marir101  mantm101  berry101  skowb101  blanj101  richb102  6
richb102  kubet101  marir101  mantm101  berry101  skowb101  blanj101  boyec102  6

Batting orders used by the 1961 Yankees
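Tallying how often each order was used is a simple counting job once there is one lineup per game. A minimal Python sketch follows, standing in for the database query described above; the three short `games` tuples are hypothetical sample data, not the real 1961 lineups.

```python
from collections import Counter

# Hypothetical sample: one tuple of Retrosheet-style player IDs per game.
# Real lineups would have eight or nine entries each.
games = [
    ("richb102", "kubet101", "marir101"),
    ("boyec102", "kubet101", "marir101"),
    ("richb102", "kubet101", "marir101"),
]

order_counts = Counter(games)

# Most-used orders first, with the number of games each was used.
for order, n in order_counts.most_common():
    print(n, " ".join(order))
print(len(order_counts), "distinct orders")
```

Because the whole tuple is the key, two lineups count as the same order only when every spot matches.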

So, for the 1961 Yankees, I chose order #2 as it has Richardson leading off followed by Kubek and it has both Berra and Howard in the lineup. These are the eight who played most often (more or less) as determined by a query of all orders used for all 162 games. These are also the top eight batters listed at Baseball Reference. This order was used to generate worst-to-best and best-to-worst orders by running a simulation where all nine batters are the same: in other words, 9 Bobby Richardsons, 9 Tony Kubeks, etc. … which generated the following:

order #2   rpg    worst-to-best  rpg    best-to-worst  rpg
richb102   3.49   shelr101       3.28   mantm101       7.67
kubet101   4.05   richb102       3.49   marir101       6.81
marir101   6.81   boyec102       3.80   howae101       6.04
mantm101   7.67   kubet101       4.05   berry101       5.00
berry101   5.00   skowb101       4.80   skowb101       4.80
howae101   6.04   berry101       5.00   kubet101       4.05
skowb101   4.80   howae101       6.04   boyec102       3.80
boyec102   3.80   marir101       6.81   richb102       3.49
shelr101   3.28   mantm101       7.67   shelr101       3.28

Runs per game for nines of the same batter
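Once each batter has a runs-per-game figure, the worst-to-best and best-to-worst orders are just two sorts. A short Python sketch using the values from the table above (the variable names are made up for illustration):

```python
# Runs per game for a lineup of nine copies of each batter (from the table).
rpg = {
    "richb102": 3.49, "kubet101": 4.05, "marir101": 6.81,
    "mantm101": 7.67, "berry101": 5.00, "howae101": 6.04,
    "skowb101": 4.80, "boyec102": 3.80, "shelr101": 3.28,
}

worst_to_best = sorted(rpg, key=rpg.get)                # ascending rpg
best_to_worst = sorted(rpg, key=rpg.get, reverse=True)  # descending rpg

print(" ".join(worst_to_best))
print(" ".join(best_to_worst))
```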

So far, I have three orders: (1) most common, (2) worst-to-best, and (3) best-to-worst. Now to find the consensus ‘best’ from a simulation test of all possible orders.

There are 40320 possible combinations. I used R to generate the list of combinations and added 9 to the last position in each row. So combination 1 is 1,2,3,4,5,6,7,8,9 – combination 2 is 1,2,3,4,5,6,8,7,9 and so on. I divided this file into eight equal lists of 5040 each. A batch file then runs all eight at the same time, dedicating one instance to each of the eight cores. Results were written to eight separate database tables and then combined.
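The list itself was generated in R; an equivalent Python sketch is shown here only to make the steps concrete. It builds every ordering of spots 1 to 8, appends the pitcher as 9, and cuts the list into eight equal batches.

```python
from itertools import permutations

# Every ordering of the eight position players, with the pitcher (9) fixed last.
orders = [p + (9,) for p in permutations(range(1, 9))]

# Split into eight equal batches, one per core.
chunk = len(orders) // 8
batches = [orders[i:i + chunk] for i in range(0, len(orders), chunk)]

print(len(orders), len(batches), len(batches[0]))
print(orders[0])
print(orders[1])
```

`itertools.permutations` emits orderings in lexicographic order, so the first two rows match the ones quoted above.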

I decided to take those orders where the number of runs generated was more than two standard deviations above the mean. This yields approximately 800 orders out of the 40,320 tested.
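The cut-off and the tally of who lands in which spot can be sketched in Python as follows. The function names and the six toy results are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, pstdev

def top_orders(results, k=2.0):
    """Keep orders whose runs exceed the mean by more than k standard deviations.

    results maps an order (tuple of batter IDs) to its simulated runs.
    Uses the population SD (an assumption).
    """
    values = list(results.values())
    cutoff = mean(values) + k * pstdev(values)
    return {order: r for order, r in results.items() if r > cutoff}

def position_counts(orders, n_slots):
    """Count how many times each batter appears in each lineup spot."""
    counts = {}
    for order in orders:
        for slot, batter in enumerate(order):
            counts.setdefault(batter, [0] * n_slots)[slot] += 1
    return counts

# Toy results with made-up run values for the six orderings of three batters.
results = {
    ("a", "b", "c"): 4.0, ("a", "c", "b"): 4.1, ("b", "a", "c"): 3.9,
    ("b", "c", "a"): 4.0, ("c", "a", "b"): 4.0, ("c", "b", "a"): 5.0,
}
best = top_orders(results)
print(best)
print(position_counts(best, 3))
```

Run over the full set of simulated orders, the second function produces a batter-by-spot grid of counts.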


batter       1    2    3    4    5    6    7    8    9
richb102     0    0   66   13  148  155  146  230    0
kubet101     0    0  379   16   15   76  162  110    0
marir101   208   49   19  263  139   35   29   16    0
mantm101   211   96  133  166   61   33   30   28    0
berry101    11   72   56  182  111   75  123  128    0
howae101    36  269   74   25   99  138   64   53    0
skowb101   214  183    2   59  117   77   50   56    0
boyec102    78   89   29   34   68  169  154  137    0
shelr101     0    0    0    0    0    0    0    0  758

and it seems to me that the consensus order should be the maximum-sum path through the table. So, I wrote an R script to find this path (it’s not always immediately obvious to the naked eye).

And the maximum-path order is:

mantm101  howae101  kubet101  marir101  skowb101  boyec102  berry101  richb102  shelr101
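Choosing one batter per spot so that the summed counts are as large as possible is a small assignment problem, and with eight position players it can simply be brute-forced. The Python sketch below is not the R script mentioned above; it uses the counts as read from the table, leaves the pitcher fixed in the ninth spot, and tries all 40,320 arrangements.

```python
from itertools import permutations

# Counts per lineup spot (1-8) for each position player, as read from the table.
counts = {
    "richb102": [0, 0, 66, 13, 148, 155, 146, 230],
    "kubet101": [0, 0, 379, 16, 15, 76, 162, 110],
    "marir101": [208, 49, 19, 263, 139, 35, 29, 16],
    "mantm101": [211, 96, 133, 166, 61, 33, 30, 28],
    "berry101": [11, 72, 56, 182, 111, 75, 123, 128],
    "howae101": [36, 269, 74, 25, 99, 138, 64, 53],
    "skowb101": [214, 183, 2, 59, 117, 77, 50, 56],
    "boyec102": [78, 89, 29, 34, 68, 169, 154, 137],
}

def score(order):
    """Sum the table entries picked out by putting each batter in his spot."""
    return sum(counts[batter][spot] for spot, batter in enumerate(order))

# Exhaustive search over every arrangement of the eight position players.
consensus = max(permutations(counts), key=score)
print(" ".join(consensus), "shelr101")
print(score(consensus))
```

Note that simply taking the largest number in each column does not work, because Kubek and Richardson would each be picked twice; the search has to respect one batter per spot.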

Now there are five combinations for further testing:

  1. The most common order used during the season
  2. The worst-to-best order
  3. The best-to-worst order
  4. The test-consensus order, and
  5. The reverse of the test-consensus order (I expect it to be a low-scoring order)

These five orders were retested: because there are only 5 orders to run instead of 5,040 per core, it takes a lot less time … so I ran each simulation 100,000 times.

The results are:

order                     rpg    r/162
most common               4.88   790
worst-to-best             4.79   777
best-to-worst             5.01   812
test consensus            5.10   827
test consensus reversed   4.85   785

Interestingly, the 1961 Yankees scored 827 runs. The fact that the test-consensus order produced the same total is a coincidence. I thought there might be a greater difference between presumed good and bad orders, but still, 0.31 runs per game (5.10 versus 4.79) is 50 runs over a season.

The whole business is fascinating (at least to me) and I’m encouraged by the fact that the simulation does yield a run-total in line with the actual total of that season. Pretty sure I’d have raised more than a few eyebrows batting the Commerce Comet at leadoff, but then he did have a 0.448 OBP!

So, where do I hope to go:

Well, I’d like to find out about all teams over all seasons and see if order spots can be characterized by batter type. That’s going to require access to some serious computer horsepower … perhaps in the cloud, but perhaps GPU processing (I’m always hopeful) … or, perhaps, by some revelation and guidance about matrices.

I’d also like to look at teams in the DH era as well, but the additional batter (not the pitcher) means I can’t take the shortcut.

All this is going to have to wait until I find a faster way to do the simulation. Either way, it’s all fun and keeps me out of trouble.

Acknowledgements

Thanks to Derek and Mark for providing the opportunity to post this article. And the biggest of shout-outs (shouts-out?) to Retrosheet for historical data (and to MLB for making current data readily available), “Baseball Hacks” by Joseph Adler, “Analyzing Baseball Data With R” by Max Marchi and Jim Albert, and “The Book” by Tom Tango, Mitchel Lichtman, and Andrew Dolphin.

I do this for fun. The fun is mine, and any mistakes in methodology or logic are also mine.

If there’s interest, I’ll post about how I simulate a game.

Thanks for reading :-þ
