Sunday, October 31, 2010

Giants vs Rangers, World Series Game #5 Simulation



November 1st Simulations, World Series Odds ...



Visitors | Home | Probable Pitching Matchup | Favorite | Vegas Win Prob | Simulator Win Prob | Actual
Giants | Rangers | T.Lincecum vs C.Lee | Rangers | 62.26% | 57.96% | SF 3-1


Final Regular Season ROI: 3.54%
Final Post-Season ROI: 42.07%

Simulation details at
Lonestar Ball
McCovey Chronicles

MLB Scoreboard

6 comments:

igsportspicks said...

Cool stuff. I was doing research work compiling websites that engage in sports updates and sports handicapping.

Thanks.

obsessivegiantscompulsive said...

I find it funny (coincidence) that by your simulation, the Rangers should have won the World Series 4-1 whereas the reality is that the Giants won it 4-1.

I'm curious how you run your simulations. Do you use a game? Some mathematical functions? I noticed you mention best guess lineups, so obviously there is a good level of sophistication to your simulation.

I've been looking to do simulations too. How did you select your simulator? Is there a good article somewhere? A good review?

Thanks for any help you can provide. Happy Holidays!

Xeifrank said...

ogc, actually the simulator had the Giants favored in three games (G1, G2, G4) and the Rangers favored in two. The team listed as the "Favorite" is actually the "Vegas" favorite. The simulator favors that team only if the simulator number is greater than 50%. If it is less than 50%, the simulator is actually favoring the other team.

Vegas: Rangers(4), Giants(1)
Simulator: Rangers(2), Giants(3)

Also, the accuracy of a simulator or any handicapping system is not best measured solely by whether it picks the correct team to win. The best measuring stick is to compare your system to Vegas. If Vegas has the Rangers as 60% favorites, the simulator says the Rangers should be 54% favorites, and the Giants win the game, then the simulator has actually done well and this would be considered a "win". If the simulator had said the Rangers had a 65% chance in this example, it would've been a loss because the simulator was on the "Rangers side" of the Vegas line.

That being said, four of the five W.S. games had slight to significant differences between the simulator and Vegas, and the simulator won all four of those contests vs Vegas. Game #2 would've been a "no bet" as the simulator and Vegas were in agreement on the odds.
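For anyone who wants to try this kind of scoring, here is a minimal Python sketch of the win/loss/no-bet rule described above. It is only an illustration, not the simulator's actual code: the function name grade_vs_vegas and the min_edge threshold for calling the two numbers "in agreement" are assumptions made for the example.

```python
def grade_vs_vegas(vegas_prob, sim_prob, favorite_won, min_edge=0.01):
    """Grade one game from the point of view of the Vegas favorite.

    vegas_prob, sim_prob: win probability of the Vegas favorite (0.0 to 1.0).
    favorite_won: True if the Vegas favorite actually won the game.
    min_edge: hypothetical cutoff; smaller differences count as "no bet".
    """
    edge = sim_prob - vegas_prob
    if abs(edge) < min_edge:
        return "no bet"
    # A positive edge puts the simulator on the favorite's side of the line,
    # a negative edge puts it on the underdog's side.
    sim_on_favorite = edge > 0
    return "win" if sim_on_favorite == favorite_won else "loss"


# Game 5 from the table: Vegas had the Rangers at 62.26%, the simulator
# had them at 57.96%, and the Giants (the underdog) won.
print(grade_vs_vegas(0.6226, 0.5796, favorite_won=False))
```

Under these assumptions, Game 5 grades as a "win" because the simulator sat on the Giants side of the Vegas line and the Giants won.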

As far as the simulation goes, I programmed my own baseball simulator in C/C++. It plays actual games. When doing regression testing on a season's worth of games, I have each game played 10,000 times. If I am doing an individual game, I have each game played 100,000 times. The more times you play a game, the smaller the margin of error that comes from sample size.
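To put rough numbers on the sample size point, here is a small Python sketch using the standard binomial approximation for the margin of error of an estimated win probability. This is textbook statistics rather than anything taken from the simulator itself; the function name margin_of_error is made up for the example, and it assumes every simulated game is independent.

```python
import math


def margin_of_error(win_prob, n_games, z=1.96):
    """Approximate 95% margin of error for a simulated win probability.

    Uses the normal approximation to the binomial:
    standard error = sqrt(p * (1 - p) / n).
    z=1.96 corresponds to a 95% confidence level.
    """
    standard_error = math.sqrt(win_prob * (1.0 - win_prob) / n_games)
    return z * standard_error


# Using the 57.96% figure from the Game 5 table as the estimated probability.
for n in (10_000, 100_000):
    moe = margin_of_error(0.5796, n)
    print(f"{n:>7} games: +/- {moe * 100:.2f} percentage points")
```

With these inputs the margin is roughly one percentage point at 10,000 games and roughly 0.3 points at 100,000. Playing ten times as many games shrinks the margin by a factor of about 3.2 (the square root of 10), not by a factor of 10.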

There really is no article on baseball simulators. You just have to do it yourself. There exist simulation programs like Diamond Mind that people use more for entertainment purposes, but you have no control over the inner workings of the simulation.

There is a lot of backtesting that goes into my simulator and it uses many principles of advanced sabermetrics to create a more realistic result.

Thanks for the question/comment.

obsessivegiantscompulsive said...

Sorry, I was not trying to criticize or mock your simulator results. I was just noting the serendipity of the reversal between the results I saw in the tables and the actual results.

Sorry, did not realize that was the Vegas prediction, I did not think that you would be noting what they say and then not note what your simulation was saying, so I assumed it was your simulation results. My fault for not paying enough attention to the details.

Very cool that you programmed it yourself in C! I didn't care much for C after I programmed an assignment (correctly, I thought), ran it by my TA, and he couldn't find the problem either. Maybe C++ fixed that problem.

While I can see your point about measuring the accuracy of the simulator vs. that of Vegas, that is true only if the sole purpose of the simulator is to beat the Las Vegas odds. On that measure, you certainly kicked their pants off (heck, would have stolen them!).

However, most people simulate to recreate reality in the most realistic way and thus would compare the results of the simulation with reality.

Thank you for your explanation, I greatly appreciate the information. I thought perhaps you were using a simulator and thus would have researched which to use. I am greatly impressed that you did it on your own!

I am aware of Diamond Mind, and if I were to get a simulator, that is probably the one I would get, without doing any research. While I would not get to program the inner workings of the simulator - I totally get wanting to control all the parts of it - I have read and respected the works of the creator, Tom Tippett, and think he is one of the better sabers around. So I have faith that he and his team would incorporate the best and latest in advanced sabermetric findings in the product (though, to be honest, I don't know if he's still involved after he sold the company, I would have to check that first before buying), at least since I don't have the time to put together my own simulator.

I'm curious, how long does it take you to set up your simulator for a game, and then how long does it take the program to run before spitting out the data?

Lastly, you noted that your simulator had the Giants up 3-2 after 5 games. I am curious if you bothered to simulate games 6 & maybe 7 to see who would have won the World Series, based on your simulator's calculations.

Not that it matters to you, but I'm very impressed by your work, thank you for sharing your information with me.

Xeifrank said...

ogc, it doesn't take me too long to set up for one game. If there are 15 games scheduled for one day and I want to run the simulations the night before or the morning of the games, I have to make a best-guess estimate of the actual lineups (as I work during the day). This does take some research, reading roto sites for injuries and probables.

The best measure of accurately recreating reality is to beat the Vegas odds. Vegas lines are moved by some very smart people. If you come closer to modelling reality (actual results) than they do, then you did a great job with your sim or system.

Another very important facet has nothing to do with the simulator itself: the hitter, pitcher, and fielding projections and the park factors that you input into the simulator. It is fascinating to find out what kind of weighting on hitters and pitchers makes the simulator post good results. The weighting that is the most successful is nothing close to the standard 3-year 5/4/3 weightings that you see from some of the projection systems like Marcel.
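For readers unfamiliar with season weightings, here is a minimal Python sketch of what a 5/4/3 weighting does to a rate stat. It is only an illustration: the function name weighted_rate and the hitter's numbers are invented, the alternative weights are arbitrary, and it leaves out the regression to the mean and age adjustment that a full projection system like Marcel also applies. It says nothing about which weights the simulator actually uses.

```python
def weighted_rate(seasons, weights=(5, 4, 3)):
    """Weighted rate stat over several seasons, most recent season first.

    seasons: list of (successes, opportunities) pairs, e.g. times on base
             and plate appearances.
    weights: one weight per season; (5, 4, 3) is the 3-year weighting
             mentioned above.
    """
    numerator = sum(w * s for w, (s, _) in zip(weights, seasons))
    denominator = sum(w * o for w, (_, o) in zip(weights, seasons))
    return numerator / denominator


# Hypothetical hitter: (times on base, plate appearances), most recent first.
hitter = [(180, 600), (200, 600), (210, 600)]

print(round(weighted_rate(hitter), 4))                     # 5/4/3 weighting
print(round(weighted_rate(hitter, weights=(1, 1, 1)), 4))  # equal weighting
print(round(weighted_rate(hitter, weights=(8, 3, 1)), 4))  # arbitrary, heavier on the latest season
```

Changing the weights shifts how much the projection leans on the most recent season, which is the knob being tuned when backtesting different weightings.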

My simulator can simulate one game 10K times in less than 3 seconds, and that is on my modestly slow laptop.

Btw, I review all comments before publishing as I tend to get some spam if I don't do that. So if you post a comment you won't see it until I read it and publish it. :)

obsessivegiantscompulsive said...

Yeah, sorry about all the posts. As I noted, my Internet Explorer died while trying to load the comment, so I tried again just in case it didn't get through, and then it died again...

But I did know that you moderate, and that's why I wasn't sure if it got through or not, and thus I tried again, just in case.

Thanks for the info about your simulator. Amazing what computers can do nowadays! :^) Great job on your simulator! I see your point about beating Vegas.

Happy Holidays!