Sunday, July 13, 2014

Masahiro Tanaka At The All-Star Break

Here is a list of the Vegas odds of all 18 of Masahiro Tanaka's starts up to the All-Star break.

Date | Away | Home | Away Pitcher | Home Pitcher | Fave | ML Fave | ML Dog | Tanaka Win%
4/4/2014 | NYA | TOR | Masahiro Tanaka | Dustin McGowan | NYA | -126 | 121 | 55.26
4/9/2014 | BAL | NYA | Miguel Gonzalez | Masahiro Tanaka | NYA | -175 | 168 | 63.17
4/16/2014 | CHN | NYA | Jason Hammel | Masahiro Tanaka | NYA | -200 | 185 | 65.81
4/22/2014 | NYA | BOS | Masahiro Tanaka | Jon Lester | BOS | -116 | 106 | 47.39
4/27/2014 | LAA | NYA | Garrett Richards | Masahiro Tanaka | NYA | -159 | 154 | 61.01
5/3/2014 | TB | NYA | Jake Odorizzi | Masahiro Tanaka | NYA | -195 | 187 | 65.64
5/9/2014 | NYA | MIL | Masahiro Tanaka | Yovani Gallardo | NYA | -129 | 124 | 55.85
5/14/2014 | NYA | NYN | Masahiro Tanaka | Rafael Montero | NYA | -165 | 159 | 61.83
5/20/2014 | NYA | CHN | Masahiro Tanaka | Jason Hammel | NYA | -158 | 153 | 60.86
5/25/2014 | NYA | CHA | Masahiro Tanaka | Andre Rienzo | NYA | -164 | 158 | 61.69
5/31/2014 | MIN | NYA | Kevin Correia | Masahiro Tanaka | NYA | -248 | 238 | 70.85
6/5/2014 | OAK | NYA | Drew Pomeranz | Masahiro Tanaka | NYA | -137 | 127 | 56.90
6/11/2014 | NYA | SEA | Masahiro Tanaka | Chris Young | NYA | -185 | 175 | 64.29
6/17/2014 | TOR | NYA | Marcus Stroman | Masahiro Tanaka | NYA | -167 | 160 | 62.05
6/22/2014 | BAL | NYA | Chris Tillman | Masahiro Tanaka | NYA | -202 | 192 | 66.33
6/28/2014 | BOS | NYA | Jon Lester | Masahiro Tanaka | NYA | -150 | 145 | 59.60
7/3/2014 | NYA | MIN | Masahiro Tanaka | Phil Hughes | NYA | -150 | 145 | 59.60
7/8/2014 | NYA | CLE | Masahiro Tanaka | Trevor Bauer | NYA | -147 | 142 | 59.10

Friday, July 04, 2014

Top 10 Biggest Road Favorites

In today's Dodgers vs Rockies game the Dodgers are a 65.6% favorite to win on the road. This is the largest road favorite so far this season. Of course, it is a game that a red-hot Clayton Kershaw is pitching in, and the Rockies' Jair Jurrjens isn't exactly the league's best pitcher. This got me thinking about what the top ten list of largest road favorites would look like this year. Kershaw and Strasburg each appear twice on this list.

Here is the list

Date | Away | Home | Away SP | Home SP | Vegas Fave | ML Fave | ML Dog | Vegas Win Exp
7/4/2014 | LAN | COL | Clayton Kershaw | Jair Jurrjens | LAN | -198 | 184 | 65.6%
5/2/2014 | SEA | HOU | Felix Hernandez | Brad Peacock | SEA | -185 | 175 | 64.3%
6/11/2014 | NYA | SEA | Masahiro Tanaka | Chris Young | NYA | -185 | 175 | 64.3%
4/15/2014 | WAS | MIA | Stephen Strasburg | Tom Koehler | WAS | -178 | 171 | 63.6%
5/23/2014 | LAN | PHI | Clayton Kershaw | Roberto Hernandez | LAN | -172 | 165 | 62.8%
4/25/2014 | OAK | HOU | Jesse Chavez | Brad Peacock | OAK | -171 | 161 | 62.4%
3/31/2014 | WAS | NYN | Stephen Strasburg | Dillon Gee | WAS | -172 | 158 | 62.3%
5/3/2014 | STL | CHN | Michael Wacha | Jake Arrieta | STL | -167 | 160 | 62.0%
4/22/2014 | STL | NYN | Adam Wainwright | Dillon Gee | STL | -168 | 158 | 62.0%
4/30/2014 | DET | CHA | Max Scherzer | Hector Noesi | DET | -168 | 158 | 62.0%

Thursday, July 03, 2014

Park Factor Surprises

Nobody likes surprises, right? Unless it is your birthday, and then maybe you do. But when it comes to park factors (runs), it is often difficult to nail down a team's park factor, and randomness plays havoc with what smart people think the park factors should be. As you know, I keep track of the runs-scored portion of each team's park factor along with a Vegas park factor that I reverse engineer from each team's over/unders: I replace the actual runs scored in each game with the Vegas over/under total. This gives me another view of the park factor: the wisdom of the crowd, the people who are actually risking their hard-earned money on knowing how many runs each game is likely to have. I love comparing things like over/unders, expected win totals and player projections to the people who risk their money on each game. Listed in the table below is each team's current 2014 park factor for runs scored along with its Vegas park factor. The table is sorted with the most similar park factors at the top and the biggest surprises at the bottom. Enjoy!

Team | Actual PF | Vegas PF | 2014 Delta
Blue Jays | 1.088 | 1.074 | 0.0136
White Sox | 1.010 | 0.990 | 0.0206
Red Sox | 0.980 | 1.055 | 0.0753
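
The comparison above can be sketched in a few lines of Python. The post does not spell out the park factor formula, so this sketch assumes the simplest common form: average total runs in a team's home games divided by average total runs in its road games. The function name and the sample numbers are hypothetical, chosen only to show the mechanics.

```python
def park_factor(home_totals, road_totals):
    """Ratio of average total runs at home to average total runs on the road.

    Assumption: a plain home/road ratio, with no further adjustments.
    """
    home_avg = sum(home_totals) / len(home_totals)
    road_avg = sum(road_totals) / len(road_totals)
    return home_avg / road_avg


# Hypothetical sample: total runs scored by both teams in each game.
actual_pf = park_factor([9, 7, 11, 9], [8, 6, 10, 8])

# Same games, but with the Vegas over/under swapped in for the runs scored.
vegas_pf = park_factor([8.5, 8.5, 9, 9], [8, 8, 8.5, 7.5])

# The "Delta" column is the absolute gap between the two park factors.
delta = abs(actual_pf - vegas_pf)
print(actual_pf, vegas_pf, delta)
```

Feeding the same function actual totals and then over/under totals keeps the two park factors directly comparable, which is what the Delta column relies on.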

MLB Over/Unders And The Empirical Data

In my previous post I used my simulator to come up with a set of equations to convert an MLB over/under to an average runs scored per game number: basically, a conversion tool to go from the median to the mean for runs scored in a game. In this post I am going to show what the actual empirical data looks like based on the 1,266 games played so far. Obviously, the sample size here is problematic. The next step will be to add data from previous seasons to what I have for the current 2014 season. I may or may not be able to do this, but here is the 2014 data nonetheless. Keep in mind that this data does not take into account the odds, or percentage chance, of the game going over or under. It assumes that every game has a 50/50 chance of going over or under, which is wrong, but it should even out a little bit.

Over/Under | Count | Average RPG

As you can see the sample size problem makes this data pretty close to unusable. And that is part of what I am trying to show here. What I would expect to see in the "Average Runs Per Game" column of the table had the sample size been in the tens of thousands is a number about 0.45 higher than the over/under number. Our largest sample size is the over/under of 7.5 and the average runs scored per game is 0.59 higher than the over/under.
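
A table like this one could be built with a sketch along the following lines. The game records here are hypothetical placeholders, not the 2014 data, and the function name is made up for illustration.

```python
from collections import defaultdict


def average_runs_by_total(games):
    """Group games by over/under and return {over_under: (count, average runs)}."""
    buckets = defaultdict(list)
    for over_under, total_runs in games:
        buckets[over_under].append(total_runs)
    return {
        ou: (len(runs), sum(runs) / len(runs))
        for ou, runs in sorted(buckets.items())
    }


# Hypothetical (over/under, total runs scored) pairs.
sample_games = [(7.5, 9), (7.5, 6), (8.0, 11), (7.5, 10), (8.0, 5)]

for ou, (count, avg) in average_runs_by_total(sample_games).items():
    # The last column is how far the average sits above the over/under.
    print(ou, count, round(avg, 2), round(avg - ou, 2))
```

The final printed column is the gap between the mean and the over/under, the quantity expected to settle near 0.45 with a large enough sample.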

Monday, June 30, 2014

Average Runs Scored Given Vegas Over/Under Odds

When you look at the Over/Under, often referred to as the "Run Total", for a major league baseball game at a Sports Book, you will see the run total given as a number like "7 runs" with juice looking something like -120/100, with the -120 being the payout for the over and the +100 being the payout for the under. Juice of -120/100 tells you that the Sports Book thinks it is a little more likely that the game will go over than under. In fact, the Sports Book is telling you there is a 52.38% chance that the game goes over and a 47.62% chance that the game goes under. Here is my algorithm and calculator showing you how to convert from a Sports Book's odds (example: -120/100) to a percentage.
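
The linked calculator is not reproduced here, so the following is only a sketch of one conversion that matches the numbers quoted in these posts (-120/100 gives 52.38%, +115/-125 gives 45.45%, -110/-110 gives 50%). It assumes the juice is removed by averaging the two prices into a single "no-juice" line and converting that line to a percentage. The function names are hypothetical.

```python
def edge_from_even(price: int) -> float:
    """Signed distance of one American price from even money.

    Positive means the price favors its own side (-120 -> 20);
    negative means it favors the other side (+115 -> -15).
    """
    return abs(price) - 100 if price < 0 else 100 - price


def implied_pct(side_price: int, other_price: int) -> float:
    """Percent chance for the side priced at side_price, with the juice removed."""
    # Assumption: average the two edges into a single no-juice line.
    edge = (edge_from_even(side_price) - edge_from_even(other_price)) / 2
    line = 100 + abs(edge)
    favorite_pct = 100 * line / (line + 100)
    return favorite_pct if edge >= 0 else 100 - favorite_pct


print(round(implied_pct(-120, 100), 2))   # over chance for -120/100
print(round(implied_pct(115, -125), 2))   # over chance for +115/-125
```

The same function works for moneylines: pass the favorite's price first and the underdog's price second.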

How about a game where the Sports Book thinks there is a 50/50 chance of the game going over or under (-110/-110)? If the run total was "7 runs" on such a game, how many runs would you expect to be scored on average if this game was played thousands of times? You might think the answer would be 7, but it is not. Seven runs would be the median, the run total where you would have the same number of overs as unders. But what about the mean, the average number of runs scored per game? Run totals are skewed: the most likely final score for almost any game with a 7-run over/under is the home team winning 3-2 (5 total runs). So we see a mean that is different from the median. How do you calculate the mean?

It's not easy to calculate. The best way is to look at the empirical data: look at games, track the run total, over juice and under juice, and see what the average number of runs scored is for games with the same values for each of the three parameters. Quickly, the problem you run into is sample size. There are just not enough games out there (each team plays only 162 per year), so this won't work very well. The solution is to create a larger sample, and the way I did this was to use my simulator to create games averaging from 5.5 up to 11.5 runs in steps of 0.05 runs per game. For example, I created two teams that averaged 5.5 runs per game when playing each other a million times. I then adjusted the two teams to create an outcome that averaged 5.55 runs per game, and so on, all the while recording the percentage of the time that each game went over or under the nearest run total.

For example, I created a game and simulated it one million times, which produced an average of 5.9902 runs scored per game. The run total closest to 50% on the over/under for this game was 5-1/2 runs. The chance of this game going over was 48.64%, and the chance of it going under was 51.36%.

Once I have enough samples at each over/under, I can get a best-fit equation (y = mx + b) for each run total, given that I know the chances that the game goes over and under. My simulator tells me this, and in the Sports Book example the over/under odds tell me this. So once I have the equations built from the simulator's data, I can use them with the Sports Book odds, after calculating the over and under chances from the odds and juice.
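
The best-fit step can be sketched with an ordinary least-squares line. The post does not say which fitting method was used, so least squares is an assumption here, and the sample points are made-up placeholders that lie exactly on a line, not simulator output.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + b; returns (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    offset = mean_y - slope * mean_x
    return slope, offset


# Hypothetical samples for one run total:
# x = percent chance of the over, y = average runs per game in the simulation.
over_pcts = [44.0, 48.0, 52.0, 56.0]
avg_runs = [7.46, 7.82, 8.18, 8.54]

m, b = fit_line(over_pcts, avg_runs)
print(m, b)
```

One such fit would be run per run total (5-1/2, 6, 6-1/2 and so on), giving one slope and one offset for each.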

So below is the table that shows you the equation for each Run Total. In the equation, "x" is the percent chance (e.g., 51.92) that the game goes "over".

Let's take the June 30th game between the Indians and Dodgers as an example. The Vegas Odds on the "Run Total" look like 7-1/2 +115/-125 which translates to an over chance of 45.45% and an under chance of 54.55%.

The equation for a game with a Run Total of 7-1/2 is: y = (0.087176)(45.45) - 3.87196653

Which tells us the average number of runs for this game (given that the Vegas Odds are true odds) is... 7.59 runs

As an interesting side note, a Run Total of 7-1/2 runs with Vegas giving a 50/50 chance of both the over and the under hitting would give us an average of 7.99 runs.
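
The two results quoted above (7.59 runs at 45.45% and 7.99 runs at 50%) can be checked with a short sketch. The slope is the one quoted for a 7-1/2 run total. The offset is an assumption of this sketch: plugged in directly, the offset printed in the equation above does not reproduce 7.59, so the value here is back-solved from the 50/50 figure of 7.99.

```python
SLOPE_7_5 = 0.087176                    # slope quoted for a Run Total of 7-1/2
OFFSET_7_5 = 7.99 - SLOPE_7_5 * 50.0    # assumption: back-solved from 50% -> 7.99 runs


def mean_runs(over_pct, slope=SLOPE_7_5, offset=OFFSET_7_5):
    """Average runs per game from y = m*x + b, where x is the over chance in percent."""
    return slope * over_pct + offset


print(round(mean_runs(45.45), 2))   # Indians vs Dodgers example
print(round(mean_runs(50.0), 2))    # 50/50 case
```

With the back-solved offset both quoted figures come out of the same line, which suggests the slope and the two results are mutually consistent.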

1. Get Vegas Run Total
2. Use the table below to determine your slope(m) and offset(b)
3. Use Vegas odds on the Run Total to determine percent chance the game goes over(x)
4. Calculate average runs scored per game by running data through the equation y = mx + b

Equation To Calculate Average Runs Scored per Game

Run Total | Slope (m) | Offset (b)

Monday, June 09, 2014

Vegas MLB Over/Under Recap

Here is a breakdown of how many times Vegas has set the over/under at each number so far this baseball season. The breakdown also shows how many times the over, the under or a push hit for each Vegas over/under. Though it is still a small sample size, more unders are hitting for the higher over/unders and more overs are hitting for the lower over/unders. I am not trying to claim any great revelations here, just reporting what the empirical data looks like so far.
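
A tally like this can be produced with a small sketch such as the one below. The sample games are hypothetical placeholders, and the function name is made up for illustration.

```python
from collections import Counter, defaultdict


def tally_results(games):
    """Count over, under and push outcomes for each over/under line."""
    tallies = defaultdict(Counter)
    for line, total_runs in games:
        if total_runs > line:
            outcome = "over"
        elif total_runs < line:
            outcome = "under"
        else:
            outcome = "push"   # only possible on whole-number lines
        tallies[line][outcome] += 1
    return tallies


# Hypothetical (over/under, total runs scored) pairs.
sample = [(8, 8), (8, 10), (7.5, 5), (8, 3), (7.5, 9)]

for line, counts in sorted(tally_results(sample).items()):
    print(line, counts["over"], counts["under"], counts["push"])
```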


Saturday, May 17, 2014

Top 20 Games With The Largest Favorites

I pulled the Top 20 games with the largest Vegas favorite this season, as I was curious to see which teams and/or pitchers frequented the list. The Astros (9), Cubs (4) and Twins (3) showed up the most on the underdog side, with the Tigers (6), Athletics (3) and Cardinals (3) showing up the most on the favored side. No pitcher on the favored side shows up more than twice (Verlander, Tanaka, Scherzer, Wainwright). The largest favorite we have seen so far had a win expectancy of 72.38%, when Jarred Cosart and the Astros lost to Max Scherzer and the Tigers by a score of 2-0. In these 20 games (small sample size) the favorites did very well. You'd expect them to win around 13 or 14 of the 20 games, but they won 16 of them for an 80% winning percentage. Of course, 20 games is a tiny sample, so there is nothing out of the ordinary about winning 16 of 20. Anyway, most of the fun is just in the list... and here it is.

Date | Away | Home | Away SP | Home SP | Vegas Fave | ML Fave | ML Dog | Vegas Win Exp | Result
5/5/2014 | HOU | DET | Jarred Cosart | Max Scherzer | DET | -270 | 254 | 72.38 | DET 2-0
4/21/2014 | HOU | SEA | Dallas Keuchel | Felix Hernandez | SEA | -262 | 247 | 71.79 | HOU 7-2
4/20/2014 | HOU | OAK | Brad Peacock | Jesse Chavez | OAK | -255 | 235 | 71.01 | OAK 4-1
4/22/2014 | CHA | DET | Charlie Leesman | Justin Verlander | DET | -250 | 230 | 70.59 | DET 8-6
5/9/2014 | MIN | DET | Phil Hughes | Justin Verlander | DET | -235 | 225 | 69.70 | MIN 2-1
5/7/2014 | HOU | DET | Brad Peacock | Rick Porcello | DET | -230 | 220 | 69.23 | DET 3-2
5/10/2014 | MIN | DET | Kyle Gibson | Max Scherzer | DET | -228 | 218 | 69.04 | DET 9-3
4/11/2014 | HOU | TEX | Scott Feldman | Yu Darvish | TEX | -225 | 215 | 68.75 | TEX 1-0
5/13/2014 | CHN | STL | Jake Arrieta | Adam Wainwright | STL | -230 | 210 | 68.75 | STL 4-3
4/10/2014 | MIA | WAS | Tom Koehler | Stephen Strasburg | WAS | -216 | 206 | 67.85 | WAS 7-1
4/10/2014 | HOU | TOR | Dallas Keuchel | R.A. Dickey | TOR | -215 | 205 | 67.74 | HOU 6-4
4/13/2014 | CHN | STL | Edwin Jackson | Michael Wacha | STL | -215 | 205 | 67.74 | STL 6-4
4/12/2014 | CHN | STL | Carlos Villanueva | Adam Wainwright | STL | -213 | 203 | 67.53 | STL 10-4
4/19/2014 | HOU | OAK | Brett Oberholtzer | Scott Kazmir | OAK | -215 | 195 | 67.21 | OAK 4-3
4/18/2014 | HOU | OAK | Jarred Cosart | Sonny Gray | OAK | -200 | 190 | 66.10 | OAK 11-3
4/22/2014 | MIN | TB | Kyle Gibson | David Price | TB | -203 | 187 | 66.10 | TB 7-3
4/9/2014 | MIA | WAS | Brad Hand | Jordan Zimmermann | WAS | -202 | 185 | 65.93 | WAS 10-7
4/16/2014 | CHN | NYA | Jason Hammel | Masahiro Tanaka | NYA | -200 | 185 | 65.81 | NYA 3-0
5/8/2014 | HOU | DET | Dallas Keuchel | Drew Smyly | DET | -197 | 188 | 65.81 | HOU 6-2
5/3/2014 | TB | NYA | Jake Odorizzi | Masahiro Tanaka | NYA | -195 | 187 | 65.64 | NYA 9-3
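
The "13 or 14 of 20" expectation, and how unusual 16 wins really is, can be checked with a short sketch. It uses the Vegas win expectancies from the table above and assumes they are true, independent probabilities; the helper name is hypothetical.

```python
# Vegas win expectancies (percent) for the 20 favorites in the table above.
win_exp = [
    72.38, 71.79, 71.01, 70.59, 69.70, 69.23, 69.04, 68.75, 68.75, 67.85,
    67.74, 67.74, 67.53, 67.21, 66.10, 66.10, 65.93, 65.81, 65.81, 65.64,
]
probs = [p / 100 for p in win_exp]

# Expected number of favorite wins is the sum of the win probabilities.
expected_wins = sum(probs)


def prob_at_least(probs, k):
    """Chance of at least k wins when game i is won with probability probs[i]."""
    dist = [1.0]  # dist[j] = probability of exactly j wins so far
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            nxt[j] += mass * (1 - p)   # favorite loses this game
            nxt[j + 1] += mass * p     # favorite wins this game
        dist = nxt
    return sum(dist[k:])


print(round(expected_wins, 2))
print(round(prob_at_least(probs, 16), 3))
```

The tail probability comes out well above 5%, in line with the remark that 16 of 20 is nothing out of the ordinary.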

Saturday's MLB Vegas Numbers

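Away | Home | Away Pitcher | Home Pitcher | Fave | ML Fave | ML Dog | Win % | O/U | Over Vig | Under Vig | Exp Runs
LAN | ARI | Clayton Kershaw | Chase Anderson | LAN | -158 | 153 | 60.86 | 8 | 105 | -115 | 7.76
NYN | WAS | Bartolo Colon | Gio Gonzalez | WAS | -150 | 145 | 59.60 | 7.5 | 117 | -127 | 7.10
TB | LAA | Cesar Ramos | C.J. Wilson | LAA | -143 | 138 | 58.42 | 8.5 | 100 | -110 | 8.33
MIA | SF | Tom Koehler | Tim Lincecum | SF | -141 | 136 | 58.07 | 7.5 | 112 | -122 | 7.16
OAK | CLE | Scott Kazmir | Josh Tomlin | OAK | -135 | 130 | 56.99 | 7.5 | -113 | 103 | 7.52
SD | COL | Robbie Erlin | Jordan Lyles | COL | -133 | 128 | 56.62 | 9.5 | -108 | -102 | 9.44
PIT | NYA | Edinson Volquez | David Phelps | NYA | -127 | 122 | 55.46 | 9 | 105 | -115 | 8.76
BAL | KC | Bud Norris | Danny Duffy | KC | -124 | 119 | 54.85 | 7.5 | 100 | -110 | 7.33
DET | BOS | Rick Porcello | John Lackey | BOS | -123 | 118 | 54.65 | 8.5 | 107 | -117 | 8.23
CIN | PHI | Homer Bailey | Cole Hamels | PHI | -119 | 114 | 53.81 | 7.5 | -105 | -105 | 7.40
ATL | STL | Aaron Harang | Shelby Miller | STL | -118 | 113 | 53.60 | 7 | -115 | 105 | 7.04
MIL | CHN | Matt Garza | Edwin Jackson | MIL | -117 | 112 | 53.38 | 7.5 | -115 | -105 | 7.47
CHA | HOU | Hector Noesi | Jarred Cosart | HOU | -115 | 110 | 52.94 | 9 | 112 | -122 | 8.66
SEA | MIN | Roenis Elias | Samuel Deduno | MIN | -110 | 105 | 51.81 | 8 | 104 | -114 | 7.77
TOR | TEX | Mark Buehrle | Robbie Ross | TEX | -105 | -105 | 50.00 | 9 | -115 | -105 | 8.97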

Notes: Table sorted by largest favorite

Friday, May 16, 2014

What Does The Leverage Index Look Like

Leverage index is a statistic invented by Tom Tango that measures the importance, or pressure, of a situation in a baseball game. An average leverage index is 1.0, and anything higher indicates an above-average pressure situation. I used my simulator to determine the average leverage index with 0, 1 and 2 outs in any inning of a game. I then used the simulator to determine the average leverage index for each half inning of a game. The results are below. For these simulations I used a few random games, so results could be slightly different with other games, but the trends should be similar. You generally see higher leverage situations the fewer outs there are, and you also tend to see higher leverage situations later in games. Those conclusions may be obvious, but at least you can get a visual image of them. Five million games were simulated.

Table 1
Average Leverage Index Based on Outs State
Outs | Average LI

Table 2
Average Leverage Index Based on Half Inning of Game (0=Top of first, 1=Bottom of first etc...)
Inning | Average LI

Graph of Table 2


Friday's MLB Vegas Numbers

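Away | Home | Away Pitcher | Home Pitcher | Fave | ML Fave | ML Dog | Win % | O/U | Over Vig | Under Vig | Exp Runs
TOR | TEX | Drew Hutchison | Yu Darvish | TEX | -166 | 159 | 61.90 | 8.5 | 100 | -110 | 8.33
SD | COL | Eric Stults | Jorge de la Rosa | COL | -153 | 148 | 60.08 | 10.5 | -105 | -105 | 10.40
LAN | ARI | Zack Greinke | Wade Miley | LAN | -138 | 128 | 57.08 | 8.5 | -105 | 115 | 8.54
TB | LAA | Chris Archer | Jered Weaver | LAA | -135 | 130 | 56.99 | 8 | -110 | 100 | 7.97
MIA | SF | Henderson Alvarez | Yusmeiro Petit | SF | -136 | 126 | 56.71 | 7.5 | 105 | -125 | 7.19
NYN | WAS | Jonathan Niese | Tanner Roark | WAS | -131 | 126 | 56.24 | 7 | -105 | -105 | 6.90
OAK | CLE | Sonny Gray | Zach McAllister | OAK | -127 | 122 | 55.46 | 7.5 | 105 | -115 | 7.26
PIT | NYA | Edinson Volquez | David Phelps | NYA | -123 | 118 | 54.65 | 8.5 | -115 | 105 | 8.54
MIL | CHN | Kyle Lohse | Jeff Samardzija | CHN | -111 | 106 | 52.04 | 6.5 | -110 | -110 | 6.40
CIN | PHI | Alfredo Simon | Kyle Kendrick | CIN | -110 | 105 | 51.81 | 8 | -116 | 106 | 8.06
ATL | STL | Ervin Santana | Lance Lynn | STL | -109 | 104 | 51.57 | 7 | 117 | -127 | 6.60
CHA | HOU | Jose Quintana | Collin McHugh | HOU | -108 | 103 | 51.34 | 8 | -120 | 110 | 8.11
SEA | MIN | Chris Young | Kyle Gibson | MIN | -107 | 102 | 51.10 | 8 | -110 | 100 | 7.97
DET | BOS | Max Scherzer | Jon Lester | BOS | -105 | 100 | 50.62 | 8 | 110 | -120 | 7.69
BAL | KC | Chris Tillman | Jeremy Guthrie | BAL | -104 | -101 | 50.37 | 8 | 102 | -112 | 7.80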

Notes: Table sorted by largest Vegas favorite.