Thursday, July 03, 2014

MLB Over/Unders And The Empirical Data

In my previous post I used my simulator to come up with a set of equations that convert an MLB over/under to an average number of runs scored per game. Basically, it is a conversion tool for going from the median to the mean of runs scored in a game. In this post I am going to show what the actual empirical data looks like, based on the 1266 games played so far. Obviously, the sample size here is a problem. The next step will be to add data from previous seasons to what I have for the current 2014 season. I may or may not be able to do this, but here is the 2014 data nonetheless. Keep in mind that this data does not take into account the odds, or percentage chance, of the game going over or under. It assumes that every game has a 50/50 chance of going over or under, which is wrong, but it should even out a little bit.

Over/Under | Count | Average RPG

As you can see, the sample size problem makes this data pretty close to unusable, and that is part of what I am trying to show here. Had the sample size been in the tens of thousands, I would expect the "Average RPG" column of the table to show a number about 0.45 higher than the over/under number. Our largest sample is the over/under of 7.5, where the average runs scored per game is 0.59 higher than the over/under.
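For anyone who wants to build the same kind of table from their own game logs, here is a minimal sketch of the grouping step. It assumes each game is stored as an (over/under, total runs) pair; the function name `summarize_by_line` and the sample games are made up for illustration and are not my 2014 data. For each over/under it reports the count, the average runs per game, and how far that average sits above the line, which is the number I am comparing to 0.45.

```python
from collections import defaultdict


def summarize_by_line(games):
    """Group games by over/under line.

    games: iterable of (over_under, total_runs) pairs (assumed layout).
    Returns {over_under: (count, average_rpg, average_rpg - over_under)}.
    """
    totals = defaultdict(lambda: [0, 0])  # line -> [game count, sum of runs]
    for line, runs in games:
        totals[line][0] += 1
        totals[line][1] += runs

    summary = {}
    for line in sorted(totals):
        count, run_sum = totals[line]
        average = run_sum / count
        summary[line] = (count, average, average - line)
    return summary


# Hypothetical games for illustration only, NOT the actual 2014 results.
sample_games = [(7.5, 9), (7.5, 6), (7.5, 10), (8.5, 7), (8.5, 11)]

for line, (count, average, diff) in summarize_by_line(sample_games).items():
    print(f"{line:>4}  {count:>5}  {average:6.2f}  {diff:+.2f}")
```

With only a handful of games per line the difference column jumps around a lot, which is the same sample size problem described above.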
