The files below contain Matlab code and a sample data set (the 2001 Atlanta Braves) that students and instructors can use as a supplement to the Markov chain lesson described in the paper "An Intuitive Markov Chain Lesson From Baseball" (Sokol, INFORMS Transactions on Education, 2004). The code and data set use a simple event model so that users can easily create and analyze their own batting order data sets.
League data: This file contains league totals that can be used to calculate transition probabilities, situational values, event values, etc. Totals for several leagues are included. Only basic data is included here, so that students do not get bogged down in the details of baseball. Results using very detailed data (errors, baserunner advancement probabilities, steals, etc.) are not very different from those obtained with the basic data. The columns in this data file follow the format described below in "Creating New Data Sets".
Matlab code: The Matlab files in this zip archive contain the Markov chain calculations necessary to evaluate batting orders.
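For readers who want to see the shape of such a calculation before opening the archive, here is a minimal Python sketch. It is not the Matlab code from the archive, and it rests on several assumptions: every batter is identical (the archive evaluates a full nine-player batting order), the only events are walk, single, double, triple, home run, and out, runners advance exactly as many bases as the batter on a hit, and runners advance on a walk only when forced. The names advance and expected_runs and the example probabilities are hypothetical. A state is a pair (bases, outs), and the value of a state is the expected number of runs scored from that state until the third out.

```python
# Illustrative sketch only: NOT the Matlab code from the archive.
# Assumptions: identical batters, six basic events, simplified runner advancement.

HIT_BASES = {"single": 1, "double": 2, "triple": 3, "home_run": 4}
NON_OUT_EVENTS = ("walk", "single", "double", "triple", "home_run")


def advance(bases, event):
    """Return (new_bases, runs) for a non-out event.

    bases is a 3-bit mask: bit 0 = first, bit 1 = second, bit 2 = third.
    """
    if event == "walk":
        # Runners move only when forced: fill the lowest empty base.
        for base in range(3):
            if not bases & (1 << base):
                return bases | (1 << base), 0
        return 0b111, 1  # bases loaded: the runner on third is forced home
    k = HIT_BASES[event]
    # Every runner moves k bases; the batter ends up on base k.
    shifted = (bases << k) | (1 << (k - 1))
    runs = bin(shifted >> 3).count("1")  # anything past third has scored
    return shifted & 0b111, runs


def expected_runs(probs, tol=1e-12):
    """Expected runs until the third out from each (bases, outs) state.

    probs maps each event name (including "out") to its probability.
    Requires probs["out"] > 0 so that the half-inning ends.
    """
    # States with outs == 3 are absorbing and keep the value 0.
    value = {(b, o): 0.0 for b in range(8) for o in range(4)}
    for outs in (2, 1, 0):
        # Non-out events keep the number of outs fixed, so each outs level
        # is solved by repeated substitution until the values stop changing.
        while True:
            change = 0.0
            for bases in range(8):
                total = probs["out"] * value[(bases, outs + 1)]
                for event in NON_OUT_EVENTS:
                    new_bases, runs = advance(bases, event)
                    total += probs[event] * (runs + value[(new_bases, outs)])
                change = max(change, abs(total - value[(bases, outs)]))
                value[(bases, outs)] = total
            if change < tol:
                break
    return value


if __name__ == "__main__":
    # Made-up probabilities for illustration; not taken from the data files.
    example = {"walk": 0.08, "single": 0.16, "double": 0.05,
               "triple": 0.005, "home_run": 0.03, "out": 0.675}
    values = expected_runs(example)
    print("Expected runs, bases empty, no outs:", round(values[(0, 0)], 3))
```

The value for bases empty with no outs is the expected number of runs per half-inning under these assumptions. Evaluating a batting order adds the identity of the next batter to the state, so that each batter's own event probabilities are used in turn.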
Data file: This file contains the input data for the 2001 Atlanta Braves. The instructions below describe how users can easily create their own data sets.
Data sets should have 9 rows (one for each player) of 7 columns each:
As mentioned above, the model used in the Matlab code is simplified to include only these basic events, so that users can easily create their own data sets from current or historical data.
Please send comments, bug reports, etc. to Joel Sokol.