Monday, January 10, 2011

Radio Silence

Not much activity here of late in terms of blog posts, but plenty of calibrating and testing has been going on behind the scenes for my baseball simulator. I have been backtesting a few code changes, both in and out of sample, on previous seasons. I am excited about the results I've been getting and I am really looking forward to the 2011 season getting started. I may have a few "just for fun" posts between now and the start of the season, when I begin posting the simulation results of actual games.

For now, I'd like to kick things off with a post documenting the win-loss records of all 30 teams from August 1st until the end of the season, roughly the last 58 games for each team. I believe that over a two-month period some interesting talent trends begin to form. One of the most important calibrations for my simulator is determining the correct weighting of past performance in order to best predict current true talent levels. You do not want to weight recent performance so heavily that you put too much emphasis on good or bad luck. On the other hand, you do not want to give so much weight to older performance that you miss a change in a player's true talent level. Finding a formula that strikes the best balance between the two is very challenging and requires careful testing on both in-sample and out-of-sample sets of games. My measuring stick is always the ROI (return on investment) of bets against the Vegas money lines. While I am always looking for flaws in my methodology and ways to improve my results, I believe I have found a good balance between past and recent performance. Hence my interest in the records of each team over the last two months of the season. The table below provides some interesting data. Even though some teams have changed more than others this offseason, it is interesting to be reminded of how each team finished the last portion of the season.
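To make the trade-off concrete, here is a minimal Python sketch of the two ideas in the paragraph above: weighting recent games more heavily than older ones, and scoring bets by ROI against money lines. This is not the simulator's actual formula. The exponential decay, the `half_life` parameter, and all function names are assumptions chosen purely for illustration.

```python
def weighted_rate(outcomes, half_life):
    """Exponentially weighted average of per-game outcomes (oldest first).

    half_life (hypothetical parameter) is the number of games after which
    an observation counts half as much as the most recent one.
    """
    n = len(outcomes)
    weights = [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]
    return sum(w * x for w, x in zip(weights, outcomes)) / sum(weights)


def moneyline_profit(odds, stake, won):
    """Profit on a single bet at American money-line odds (e.g. +150, -200)."""
    if not won:
        return -stake
    if odds > 0:
        return stake * odds / 100   # +150 pays 150 on a 100 stake
    return stake * 100 / -odds      # -200 pays 50 on a 100 stake


def roi(bets):
    """bets: iterable of (odds, stake, won). Returns total profit / total staked."""
    bets = list(bets)
    staked = sum(stake for _, stake, _ in bets)
    profit = sum(moneyline_profit(o, s, w) for o, s, w in bets)
    return profit / staked


# Made-up example: one winning bet at +150 and one losing bet at -200
print(roi([(150, 100, True), (-200, 100, False)]))  # 0.25
```

A short `half_life` chases hot and cold streaks (luck), while a long one is slow to notice a real change in talent. Sweeping that value and comparing out-of-sample ROI is one way to search for the balance described above.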

2010 Standings (Aug 1st - Oct 3rd)
Team Div Wins Losses WPct
PHI NLE 40 17 .702
MIN ALC 36 22 .621
BAL ALE 33 24 .579
CIN NLC 33 24 .579
SF NLW 32 25 .561
TB ALE 32 27 .542
ATL NLE 32 27 .542
HOU NLC 32 27 .542
TOR ALE 31 27 .534
BOS ALE 30 28 .517
CHA ALC 30 28 .517
MIL NLC 29 28 .509
TEX ALW 29 29 .500
CHN NLC 29 29 .500
COL NLW 29 29 .500
SD NLW 30 30 .500
NYA ALE 29 30 .492
OAK ALW 29 30 .492
DET ALC 28 30 .483
STL NLC 28 30 .483
LAA ALW 27 29 .482
FLA NLE 27 30 .474
ARI NLW 27 31 .466
CLE ALC 26 32 .448
NYN NLE 26 32 .448
LAN NLW 26 32 .448
KC ALC 23 35 .397
WAS NLE 23 35 .397
SEA ALW 22 35 .386
PIT NLC 21 38 .356

Now how about a look at what the playoffs would have looked like using only the last two months of the season as the final win-loss records? The National League ends up with the exact same four teams (PHI, CIN, SF, ATL) that actually made the playoffs. In the American League, we end up with three of the four teams (MIN, TB, TEX) that actually made the playoffs. The Yankees are the only one of the eight actual playoff teams that would have missed out; they are replaced by the Orioles, who tied for the third-best record of any team over the last two months of the season.

American League
Rays vs Twins
Rangers vs Orioles
National League
Giants vs Phillies
Braves vs Reds
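
For anyone who wants to reproduce the playoff field from the table, here is a small Python sketch. It takes the three division winners plus the best remaining record as the wild card in each league. The function name and the tie-break are my own assumptions for this example: ties are broken by table order, which matters because ATL and HOU both finished 32-27.

```python
# (team, division, wins, losses) copied from the standings table above
STANDINGS = [
    ("PHI", "NLE", 40, 17), ("MIN", "ALC", 36, 22), ("BAL", "ALE", 33, 24),
    ("CIN", "NLC", 33, 24), ("SF", "NLW", 32, 25), ("TB", "ALE", 32, 27),
    ("ATL", "NLE", 32, 27), ("HOU", "NLC", 32, 27), ("TOR", "ALE", 31, 27),
    ("BOS", "ALE", 30, 28), ("CHA", "ALC", 30, 28), ("MIL", "NLC", 29, 28),
    ("TEX", "ALW", 29, 29), ("CHN", "NLC", 29, 29), ("COL", "NLW", 29, 29),
    ("SD", "NLW", 30, 30), ("NYA", "ALE", 29, 30), ("OAK", "ALW", 29, 30),
    ("DET", "ALC", 28, 30), ("STL", "NLC", 28, 30), ("LAA", "ALW", 27, 29),
    ("FLA", "NLE", 27, 30), ("ARI", "NLW", 27, 31), ("CLE", "ALC", 26, 32),
    ("NYN", "NLE", 26, 32), ("LAN", "NLW", 26, 32), ("KC", "ALC", 23, 35),
    ("WAS", "NLE", 23, 35), ("SEA", "ALW", 22, 35), ("PIT", "NLC", 21, 38),
]


def playoff_field(standings, league):
    """Return ({division: winner}, wild_card) for league 'AL' or 'NL'."""
    teams = [t for t in standings if t[1].startswith(league)]
    # Best winning percentage first. Python's sort is stable, so tied teams
    # keep their table order (an arbitrary tie-break, not an official rule).
    teams.sort(key=lambda t: t[2] / (t[2] + t[3]), reverse=True)
    winners = {}
    for team, div, _, _ in teams:
        winners.setdefault(div, team)  # first team seen in a division is its best
    wild_card = next(t[0] for t in teams if t[0] not in winners.values())
    return winners, wild_card


for league in ("AL", "NL"):
    print(league, playoff_field(STANDINGS, league))
```

With the table as given, this yields MIN, BAL, and TEX as American League division winners with TB as the wild card, and PHI, CIN, and SF as National League division winners with ATL as the wild card (ahead of HOU only because of the table-order tie-break).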

So our one surprise team is the Orioles. What happened to their team? Did their players suddenly improve their true talent levels? Or did their luck just improve? It is likely a combination of the two. The important part is obviously being able to measure how much of it was luck and how much of it was an actual positive talent trend. The early 2011 simulations will likely show some favorable Oriole predictions. It will be interesting to see how a team like the Orioles does during the first month or two of the season in the tough AL East division.

Next up, I will begin to look at some head-to-head, two-team matchups as the rosters begin to take shape. I will also have a 2011 "crazy prediction" post, where I make a crazy, "just for fun" prediction for each team as a break from the usual logic-driven posts I make.


Ballas said...

Do you ever evaluate your simulator based on how well the predicted probabilities correlate with observed win probabilities? I.e., for games where you predicted a 55% chance of the away team winning, how often did that team actually win? You would hope to see something close to 55%. I have been trying to think of the best measuring sticks for simulator performance, and I thought of this method as an alternative to ROI, though ROI is ultimately what we are all after :). What do you think?

Xeifrank said...

Ballas, I think it would be a good alternative measuring stick. And to take it a step further, you could do the same for the Vegas odds and see which does better. But like you mentioned above, what we are ultimately after is a good ROI. I am not sure my sample of 55% games is large enough, so in your example I would probably have to open it up to ranges like 48-52, 52-56, 56-60, etc.

Great comment. I have all the data to do this, I just need to sit down and do it. I will try to get something out before the season starts.
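
The binned comparison discussed in this thread could be sketched in Python roughly as follows. This is only an illustration under my own assumptions: the function name, the bin edges, and the sample numbers are made up, and none of it is output from the simulator.

```python
def calibration_table(predictions, outcomes, edges):
    """Compare predicted win probabilities with observed results, by bin.

    predictions: predicted win probability for a team in each game
    outcomes:    1 if that team won, 0 if it lost
    edges:       bin boundaries, e.g. [0.48, 0.52, 0.56, 0.60]
    Returns rows of (low, high, games, mean_predicted, observed_win_rate).
    """
    rows = []
    for low, high in zip(edges, edges[1:]):
        in_bin = [(p, y) for p, y in zip(predictions, outcomes) if low <= p < high]
        if not in_bin:
            continue  # skip empty bins rather than divide by zero
        n = len(in_bin)
        mean_pred = sum(p for p, _ in in_bin) / n
        win_rate = sum(y for _, y in in_bin) / n
        rows.append((low, high, n, mean_pred, win_rate))
    return rows


# Made-up sample data, only to show the shape of the output
preds = [0.50, 0.51, 0.54, 0.55, 0.58]
wins = [1, 0, 1, 1, 0]
for row in calibration_table(preds, wins, [0.48, 0.52, 0.56, 0.60]):
    print(row)
```

Running the same function on the win probabilities implied by the Vegas lines would give a side-by-side view of which set of probabilities tracks the observed win rates more closely in each bin.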