Validation of LINEUP

The 1995 version of LINEUP was used during 319 TBA games in 1994. The observed record was 183-136 (0.574), and the winning % predicted by the LINEUP Gaussian W% was 0.585, or roughly 187 wins. Since the predicted standard deviation for these games was about 8.5 wins (assuming a one-sample binomial model for game-to-game winning percentage), the LINEUP prediction is well within the 'uncertainty' of the data, especially considering that the prediction ignores bench and bullpen strength. For the same set of games, the Pythagorean values were 0.571 and 0.567 for the basic and Gaussian RC, respectively.
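The arithmetic behind this comparison can be checked directly. The following is a minimal sketch, assuming the 319 decided games implied by the 183-136 record and a simple binomial model; the function name binomial_win_sd is illustrative and is not part of LINEUP.

```python
import math


def binomial_win_sd(games: int, win_pct: float) -> float:
    """Standard deviation of the win total under a one-sample binomial model."""
    return math.sqrt(games * win_pct * (1.0 - win_pct))


wins, losses = 183, 136
games = wins + losses      # decided games implied by the observed record
predicted_pct = 0.585      # LINEUP Gaussian W% quoted in the text

predicted_wins = games * predicted_pct
sd_wins = binomial_win_sd(games, predicted_pct)
# Distance of the observed win total from the prediction, in standard deviations.
z = (wins - predicted_wins) / sd_wins

print(f"observed W%    = {wins / games:.3f}")
print(f"predicted wins = {predicted_wins:.1f}")
print(f"binomial sd    = {sd_wins:.1f} wins")
print(f"z-score        = {z:+.2f}")
```

Under these assumptions the standard deviation comes out a little under 9 wins, in line with the rough figure quoted above, and the observed 183 wins sit less than half a standard deviation below the prediction.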

Using the SOM computer game and the 1993 season data, a fixed pair of lineups was played against each other over 11,000 times. The computer manager was set to highly conservative settings for stealing and strategy plays, and players with poor stealing ability (but average base running) were used. Also, the starting pitchers were set to 9 endurance and the early fatigue rules were disabled. Finally, no substitutions were possible (both benches and bullpens were empty, and each rotation consisted of 4 identical pitchers). The observed winning % was 0.610, with a standard deviation of approximately 0.005. LINEUP gave 0.610 (Gaussian RC) and 0.613 (regular RC) for the Pythagorean winning % values and 0.639 for the Gaussian winning % value. Both Pythagorean values lie within one standard deviation of the observed value, so they are consistent with this data at the 5% level. The Gaussian winning % value, at nearly six standard deviations above the observed value, is almost certainly optimistic (i.e., it tends to be too far from 0.500); the chance that it is indeed accurate is less than 1%.
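The comparison against the simulated result can be expressed as z-scores. This is a rough sketch only: it assumes the observed winning % is approximately normally distributed with the quoted standard deviation of 0.005, and the helper names z_score and two_sided_p are hypothetical, not LINEUP functions.

```python
import math


def z_score(predicted: float, observed: float, sd: float) -> float:
    """Distance of a prediction from the observed value, in standard deviations."""
    return (predicted - observed) / sd


def two_sided_p(z: float) -> float:
    """Two-sided tail probability under a normal approximation (an assumption)."""
    return math.erfc(abs(z) / math.sqrt(2.0))


observed, sd = 0.610, 0.005   # simulated W% and its standard deviation
predictions = {
    "Pythagorean (Gaussian RC)": 0.610,
    "Pythagorean (regular RC)": 0.613,
    "Gaussian W%": 0.639,
}

for name, value in predictions.items():
    z = z_score(value, observed, sd)
    print(f"{name:26s} z = {z:+.1f}  p = {two_sided_p(z):.2g}")
```

The Pythagorean values land within one standard deviation of the observed winning %, while the Gaussian value is close to six standard deviations away, which is why it is treated as optimistic here.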

These and other experiments have led us to the following conclusions:

1. The Pythagorean values are undoubtedly more accurate than the Gaussian value as the game becomes one-sided.

2. The Gaussian value, although generally optimistic, tends to do better in predicting close games. This is not a clear-cut result, because it takes an enormous number of trials to tell two close teams apart with high confidence (more than 1,000,000 to get the standard deviation of W% down to 0.0005), and we just don't have enough data yet.

3. LINEUP Pythagorean winning % estimates are accurate to 2 significant digits, and the noise is somewhere around + or - 0.005 (i.e., a substitution that produces a 0.010 change is close to a wash).
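The trial count quoted in conclusion 2 follows from the binomial standard deviation of an observed winning %, sqrt(p(1-p)/n). A minimal sketch, assuming evenly matched teams (p = 0.5) unless stated otherwise; trials_for_sd is an illustrative name only.

```python
import math


def trials_for_sd(target_sd: float, win_pct: float = 0.5) -> int:
    """Approximate number of games needed for the binomial standard
    deviation of the observed W% to fall to target_sd."""
    return math.ceil(win_pct * (1.0 - win_pct) / target_sd ** 2)


for target in (0.005, 0.001, 0.0005):
    print(f"sd {target:.4f} -> about {trials_for_sd(target):,} games")
```

Because the standard deviation shrinks only with the square root of the number of games, each extra digit of precision costs roughly 100 times as many trials: about 10,000 games for 0.005 and about 1,000,000 for 0.0005.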

Based on all of this, you may ask why we include the Gaussian winning % at all. For two reasons: (1) it has far better theoretical support than the empirical Pythagorean measure, and (2) it is much more amenable to small refinements as we increase the accuracy of the model created from our SOM runs. We anticipate eventually improving its accuracy to exceed that of the Pythagorean values.

Overall, there is strong evidence for LINEUP as a statistically accurate predictor of W% in SOM baseball.