Not much activity here lately in terms of blog posts, but plenty of calibrating and testing has been going on behind the scenes for my baseball simulator. I have been backtesting a few code changes, both in and out of sample, against previous seasons. I am excited about the results I've been getting, and I am really looking forward to the 2011 season getting started. I may have a few "just for fun" posts between now and the start of the season, when I begin posting the simulation results of actual games.
For now, I'd like to kick things off with a post documenting the win-loss records of all 30 teams from August 1st until the end of last season, roughly the last 58 games for each team. I believe that over a two-month period some interesting talent trends begin to form.
One of the most important calibrations for my simulator is determining the correct weighting of past performance in order to best predict current true talent levels. You do not want to weight recent performance so heavily that you put too much emphasis on good or bad luck. On the other hand, you do not want to give so much weight to older performance that you miss a change in a player's true talent level. Finding a formula that strikes the best balance between the two is very challenging and requires careful testing on both in-sample and out-of-sample sets of games. My measuring stick is always the ROI (return on investment) of bets against the Vegas money lines. While I am always looking for flaws in my methodology and ways to improve my results, I believe I have found a good balance between past and recent performance. Hence my interest in the records of each team over the last two months of the season. The table below provides some interesting data. Even though some teams have changed more than others this offseason, it is interesting to be reminded of how each team finished the last portion of the season.
2010 Standings (Aug 1st - Oct 3rd) | ||||
Team | Div | Wins | Losses | WPct |
PHI | NLE | 40 | 17 | .702 |
MIN | ALC | 36 | 22 | .621 |
BAL | ALE | 33 | 24 | .579 |
CIN | NLC | 33 | 24 | .579 |
SF | NLW | 32 | 25 | .561 |
TB | ALE | 32 | 27 | .542 |
ATL | NLE | 32 | 27 | .542 |
HOU | NLC | 32 | 27 | .542 |
TOR | ALE | 31 | 27 | .534 |
BOS | ALE | 30 | 28 | .517 |
CHA | ALC | 30 | 28 | .517 |
MIL | NLC | 29 | 28 | .509 |
TEX | ALW | 29 | 29 | .500 |
CHN | NLC | 29 | 29 | .500 |
COL | NLW | 29 | 29 | .500 |
SD | NLW | 30 | 30 | .500 |
NYA | ALE | 29 | 30 | .492 |
OAK | ALW | 29 | 30 | .492 |
DET | ALC | 28 | 30 | .483 |
STL | NLC | 28 | 30 | .483 |
LAA | ALW | 27 | 29 | .482 |
FLA | NLE | 27 | 30 | .474 |
ARI | NLW | 27 | 31 | .466 |
CLE | ALC | 26 | 32 | .448 |
NYN | NLE | 26 | 32 | .448 |
LAN | NLW | 26 | 32 | .448 |
KC | ALC | 23 | 35 | .397 |
WAS | NLE | 23 | 35 | .397 |
SEA | ALW | 22 | 35 | .386 |
PIT | NLC | 21 | 38 | .356 |
Now let's look at how the playoffs would have looked using only the last two months of the season as the final win-loss records. The National League ends up with the exact same four teams (PHI, CIN, SF, ATL) that actually made the playoffs. In the American League, we end up with three of the four teams (MIN, TB, TEX) that actually made the playoffs. The only one of the eight actual playoff teams across both leagues that would have missed out is the Yankees, replaced by the Orioles, who tied for the third-best record of any team over the last two months of the season.
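For anyone who wants to reproduce that playoff field from the table, here is a minimal sketch. It assumes the format in use at the time (three division winners plus one wild card per league) and copies the records straight from the table above. The function name `playoff_field` and the tie rule are my own assumptions for illustration: ties are broken by table order, which matters because the Astros finished with the same 32-27 record as the Braves.

```python
# Records copied from the standings table above: team, division, wins, losses.
STANDINGS = """
PHI NLE 40 17
MIN ALC 36 22
BAL ALE 33 24
CIN NLC 33 24
SF NLW 32 25
TB ALE 32 27
ATL NLE 32 27
HOU NLC 32 27
TOR ALE 31 27
BOS ALE 30 28
CHA ALC 30 28
MIL NLC 29 28
TEX ALW 29 29
CHN NLC 29 29
COL NLW 29 29
SD NLW 30 30
NYA ALE 29 30
OAK ALW 29 30
DET ALC 28 30
STL NLC 28 30
LAA ALW 27 29
FLA NLE 27 30
ARI NLW 27 31
CLE ALC 26 32
NYN NLE 26 32
LAN NLW 26 32
KC ALC 23 35
WAS NLE 23 35
SEA ALW 22 35
PIT NLC 21 38
"""


def playoff_field(standings_text):
    """Return {league: ([division winners], wild card)} by winning percentage."""
    teams = []
    for line in standings_text.split("\n"):
        if not line.strip():
            continue
        name, div, wins, losses = line.split()
        wins, losses = int(wins), int(losses)
        teams.append((name, div, wins / (wins + losses)))

    field = {}
    for league in ("AL", "NL"):
        in_league = [t for t in teams if t[1].startswith(league)]
        winners = []
        for div in sorted({t[1] for t in in_league}):
            # max() returns the first maximal item, so ties fall to table order
            # (an assumption made here, not an official tiebreaker).
            winners.append(max((t for t in in_league if t[1] == div),
                               key=lambda t: t[2]))
        rest = [t for t in in_league if t not in winners]
        wild_card = max(rest, key=lambda t: t[2])
        field[league] = ([t[0] for t in winners], wild_card[0])
    return field


if __name__ == "__main__":
    for league, (winners, wild_card) in playoff_field(STANDINGS).items():
        print(league, "division winners:", winners, "wild card:", wild_card)
```

With a different tiebreaker, the NL wild card could just as easily go to Houston, so the "exact same four teams" result depends on how that tie is resolved.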
American League | |
Rays | vs Twins |
Rangers | vs Orioles |
  | |
National League | |
Giants | vs Phillies |
Braves | vs Reds |
So our one surprise team is the Orioles. What happened to their team? Did their players suddenly improve their true talent levels? Or did their luck just improve? It is likely a combination of the two. The important part is obviously being able to measure how much of it was luck and how much of it was an actual positive talent trend. The early 2011 simulations will likely show some favorable Orioles predictions. It will be interesting to see how a team like the Orioles does during the first month or two of the season in the tough AL East division.
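To make the luck-versus-trend trade-off a little more concrete, here is a toy sketch of one common way to weight recent games more heavily: exponential decay with a half-life. This is not the formula my simulator uses. The function name `recency_weighted_rate`, the `half_life` parameter, and the sample data are all made up for illustration.

```python
def recency_weighted_rate(outcomes, half_life):
    """Weighted average of per-game outcomes (1 = win, 0 = loss), oldest first.

    A game played `half_life` games before the most recent one counts
    half as much as the most recent one.
    """
    if not outcomes:
        raise ValueError("need at least one outcome")
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    n = len(outcomes)
    weights = [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]
    return sum(w * x for w, x in zip(weights, outcomes)) / sum(weights)


if __name__ == "__main__":
    # Hypothetical team: .500 ball for 100 games, then a 20-game hot streak.
    games = [0, 1] * 50 + [1] * 20
    print("short memory:", round(recency_weighted_rate(games, half_life=10), 3))
    print("long memory: ", round(recency_weighted_rate(games, half_life=500), 3))
```

A short half-life chases the hot streak (and any luck in it), while a long half-life barely moves. Choosing the value in between is exactly the kind of thing that has to be settled by backtesting rather than by eye.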
Next up, I will begin to look at some head-to-head matchups as the rosters begin to shape up. I will also have a 2011 "crazy prediction" post, where I make a crazy, "just for fun" prediction for each team, as a break from my usual logic-driven posts.
2 comments:
Do you ever evaluate your simulator based on how well the predicted probabilities correlate with observed win probabilities? ...i.e. for games that you predicted a 55% chance of the away team winning, how often that team actually won...you would hope to see something close to 55%. I have been trying to think of the best measuring sticks for simulator performance, and I thought of this method as an alternative to ROI, though ROI is ultimately what we are all after :). What do you think?
Ballas, I think it would be a good alternative measuring stick. And to take it a step further, you could do the same for the Vegas odds and see which does better. But like you mentioned above, what we are ultimately after is a good ROI. I am not sure my sample of 55% games is large enough, so in your example I would probably have to open it up to ranges like 48-52, 52-56, 56-60, etc.
Great comment. I have all the data to do this, I just need to sit down and do it. I will try to get something out before the season starts.
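In the meantime, here is a rough sketch of what both measuring sticks could look like in code. It is only an illustration under stated assumptions: the function names, the bin edges (taken from the ranges mentioned above), and the flat one-unit stake are hypothetical, and the money lines are assumed to be in American format (for example -150 or +130).

```python
def calibration_table(predictions, outcomes, edges=(0.48, 0.52, 0.56, 0.60)):
    """Bin games by predicted win probability and compare with actual results.

    Returns a list of (low, high, n_games, mean_predicted, observed_rate).
    Each bin includes `low` and excludes `high`; empty bins are skipped.
    """
    rows = []
    for low, high in zip(edges, edges[1:]):
        in_bin = [(p, y) for p, y in zip(predictions, outcomes) if low <= p < high]
        if not in_bin:
            continue
        n = len(in_bin)
        mean_predicted = sum(p for p, _ in in_bin) / n
        observed_rate = sum(y for _, y in in_bin) / n
        rows.append((low, high, n, mean_predicted, observed_rate))
    return rows


def moneyline_profit(odds, won, stake=1.0):
    """Profit on a single bet at American odds; a loss costs the stake."""
    if not won:
        return -stake
    return stake * (odds / 100.0 if odds > 0 else 100.0 / -odds)


def roi(bets, stake=1.0):
    """Total profit divided by total amount staked; bets = [(odds, won), ...]."""
    if not bets:
        raise ValueError("need at least one bet")
    total = sum(moneyline_profit(odds, won, stake) for odds, won in bets)
    return total / (stake * len(bets))


if __name__ == "__main__":
    # Made-up predictions and results, purely to show the output shape.
    preds = [0.49, 0.50, 0.53, 0.55, 0.57, 0.59]
    wins = [1, 0, 1, 1, 0, 1]
    for row in calibration_table(preds, wins):
        print(row)
    print("ROI:", roi([(-200, True), (150, False)]))
```

A well-calibrated simulator should show observed rates close to the mean predicted probability in each bin. Running the same table on the probabilities implied by the Vegas lines would give the side-by-side comparison.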