Wednesday, June 26, 2013

Introducing Win Probability Added Over Expectation OR WPAOE


This conversation over at The Book Blog, about why Fangraphs' live Win Expectancy does not take into account the actual players in the game, got me thinking. Fangraphs uses a one-size-fits-all chart for things like Win Expectancy (WE), Win Probability Added (WPA) and Leverage Index (LI). The problem is that you get the exact same WE for a 0-0 first-inning game with the Astros (Dallas Keuchel) facing the Tigers in Detroit with Justin Verlander or Max Scherzer pitching as you would for two evenly matched teams. You also get incorrect WPAs for each game event. These discrepancies last throughout the entire game. If you want a true in-game Win Expectancy or Win Probability Added, you aren't going to get one all the time. To others it might not be a big deal, as the one-size-fits-all chart is somewhat reasonable. But it is uninteresting when you can get close to the real thing. Why not shoot for the real thing when it can be done? Why use two sticks to make fire when you have a lighter?

While I think it should be a goal to do better on the in-game Win Expectancies, I can see some usefulness in WPA, as it tells you a story of which players earned which portions of a win or a loss. I do, however, find it a little troubling that players are receiving more or less than their fair share of WPA because WPA does not adjust for the strength or weakness of the opponent. If a game that should be a 75/25 WE game gets calculated as a 50/50 WE game, then the stronger team will on average gain 0.25 points of WPA that it shouldn't be getting. That is why I am proposing a new variation of WPA, called Win Probability Added Over Expectation (WPAOE), in which a player exceeds expectation if he scores a WPAOE over 0.0 (and falls short of expectation if he scores under 0.0). WPAOE will use a "more correct" version of Win Expectancy to calculate how much each game event (out, hit, stolen base, etc.) changes each player's contribution to the win or loss.
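The 0.25 figure can be checked with a little arithmetic. The sketch below is only an illustration, not part of any site's WPA code: it assumes that a team's total WPA for a game is the final WE (1 for a win, 0 for a loss) minus the starting WE, and the function name `expected_team_wpa` is made up for this example.

```python
def expected_team_wpa(true_win_prob, baseline_we):
    """Average team WPA per game when the team truly wins with
    probability true_win_prob but the chart starts it at baseline_we."""
    win_credit = 1.0 - baseline_we    # team total WPA if it wins
    loss_credit = 0.0 - baseline_we   # team total WPA if it loses
    return true_win_prob * win_credit + (1.0 - true_win_prob) * loss_credit


# A true 75/25 favorite scored on a 50/50 chart is padded on average.
padded = expected_team_wpa(0.75, 0.50)
# The same favorite scored from its true starting WE averages out to zero.
fair = expected_team_wpa(0.75, 0.75)
print(round(padded, 4), round(fair, 4))
```

Under these assumptions the average works out to the true win probability minus the starting WE, which is why the gap disappears once the starting point matches the real odds.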

So how can you calculate the true in-game win expectancy? Well, this isn't something the average person can do. The pre-game win expectancy can easily be calculated from Vegas odds. But how do you adjust the in-game WE as the game goes on? You basically need a "good" baseball game simulator, something that sites like Fangraphs do not have. Even if they had one, you would run into the problem of how quickly it could compute a new in-game Win Expectancy after every game event and for multiple games. So there are some difficult technical hurdles, but I am laying the groundwork for how this could and should eventually be done.
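For the pre-game step, one common approach (an assumption here, since the exact method is not spelled out) is to convert each side's American moneyline into an implied probability and then rescale the two so they sum to 1, which strips out the bookmaker's margin. The moneylines in the example are hypothetical and are not the lines for any real game.

```python
def implied_probability(moneyline):
    """Implied win probability from an American moneyline, margin included."""
    if moneyline < 0:
        return -moneyline / (-moneyline + 100.0)   # favorite, e.g. -150
    return 100.0 / (moneyline + 100.0)             # underdog, e.g. +130


def pregame_win_expectancy(favorite_line, underdog_line):
    """Rescale both implied probabilities so they sum to 1."""
    p_fav = implied_probability(favorite_line)
    p_dog = implied_probability(underdog_line)
    total = p_fav + p_dog
    return p_fav / total, p_dog / total


# Hypothetical lines, for illustration only.
fav_we, dog_we = pregame_win_expectancy(-150, 130)
print(round(fav_we, 4), round(dog_we, 4))
```

Simple rescaling is only one way to remove the margin; other methods would give slightly different starting values.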

In the example below, I took the 2013 game with the highest pre-game win expectancy and calculated the win expectancies and WPAOEs after each game event, as an example of how I imagine this looking somewhere down the road. As you can see, the game starts out with the Tigers as a 78.23% favorite (SIM WE = Tigers WE), while Fangraphs starts the Tigers' WE at 50.0%. In a nutshell, what I did here is simulate this game 1 million times at every new game state. All win expectancies (FG, SIM) are listed as the Tigers' percent chance of winning. And yes, my simulator accounts for pretty much everything meaningful you can think of.
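To make the "simulate from every game state" idea concrete, here is a deliberately crude sketch. It is not my simulator: every name in it (`half_inning_runs`, `simulate_home_win`, `win_expectancy`, the `state` fields, the single `p_reach` skill number per team) is invented for illustration, and the run model only scores a run when a batter reaches with the bases full. The point is just the shape of the calculation: play the rest of the game many times from the current state and count how often the home team wins.

```python
import random


def half_inning_runs(p_reach, rng, outs=0, runners=0):
    """Toy half-inning: each batter reaches base with probability p_reach,
    otherwise makes an out. A run scores only when the bases are full."""
    runs = 0
    while outs < 3:
        if rng.random() < p_reach:
            if runners == 3:
                runs += 1          # bases full: a runner is forced home
            else:
                runners += 1
        else:
            outs += 1
    return runs


def simulate_home_win(state, p_away, p_home, rng):
    """Play out the rest of one game from state; True if the home team wins."""
    inning, top = state["inning"], state["top"]
    outs, runners = state["outs"], state["runners"]
    away, home = state["away_runs"], state["home_runs"]
    while True:
        if top:
            away += half_inning_runs(p_away, rng, outs, runners)
            top = False
        else:
            # The home team does not bat in the 9th or later if already ahead.
            if not (inning >= 9 and home > away):
                home += half_inning_runs(p_home, rng, outs, runners)
            if inning >= 9 and home != away:
                return home > away
            inning, top = inning + 1, True
        outs, runners = 0, 0       # later half-innings start clean


def win_expectancy(state, p_away, p_home, trials=10000, seed=0):
    """Monte Carlo estimate of the home team's win expectancy."""
    rng = random.Random(seed)
    wins = sum(simulate_home_win(state, p_away, p_home, rng)
               for _ in range(trials))
    return wins / trials


first_pitch = {"inning": 1, "top": True, "outs": 0, "runners": 0,
               "away_runs": 0, "home_runs": 0}
print(win_expectancy(first_pitch, p_away=0.28, p_home=0.36))
```

Re-running `win_expectancy` with an updated `state` after each play gives a new in-game WE, and the trial count controls the trade-off between speed and noise mentioned above.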

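| Pitcher | Player | Inn | Outs | Base | HOU Runs | DET Runs | Play | FG WE | SIM WE | FG WPA | SIM WPA |
|---|---|---|---|---|---|---|---|---|---|---|---|
| NA | NA | - | 0 | 0 | 0 | 0 |  | 50 | 78.23 | 0 | 0 |
| M.Scherzer | R.Grossman | T1 | 0 | 0 | 0 | 0 | BB | 46.5 | 75.93 | 0.035 | 0.023 |
| M.Scherzer | J.Elmore | T1 | 0 | 1 | 0 | 0 | KS | 49.8 | 78.13 | -0.033 | -0.022 |
| M.Scherzer | R.Grossman | T1 | 1 | 1 | 0 | 0 | CS | 53.8 | 80.48 | -0.04 | -0.0235 |
| M.Scherzer | C.Pena | T1 | 2 | 0 | 0 | 0 | F-8 | 54.8 | 81.26 | -0.01 | -0.0078 |
| D.Keuchel | O.Infante | B1 | 0 | 0 | 0 | 0 | KS | 52.6 | 79.72 | -0.022 | -0.0154 |
| D.Keuchel | T.Hunter | B1 | 1 | 0 | 0 | 0 | F-9 | 51 | 78.29 | -0.016 | -0.0143 |
| D.Keuchel | M.Cabrera | B1 | 2 | 0 | 0 | 0 | F-9 | 50 | 76.97 | -0.01 | -0.0132 |
| M.Scherzer | C.Carter | T2 | 0 | 0 | 0 | 0 | F-3 | 52.4 | 78.11 | -0.024 | -0.0114 |
| M.Scherzer | J.Martinez | T2 | 1 | 0 | 0 | 0 | F-9 | 54 | 78.93 | -0.016 | -0.0082 |
| M.Scherzer | C.Corporan | T2 | 1 | 0 | 0 | 0 | HR | 43.1 | 68.21 | 0.109 | 0.1072 |
| M.Scherzer | J.Paredes | T2 | 2 | 0 | 1 | 0 | KS | 44.1 | 68.68 | -0.01 | -0.0047 |
| D.Keuchel | P.Fielder | B2 | 0 | 0 | 1 | 0 | 4-3 | 41.6 | 66.49 | -0.025 | -0.0219 |
| D.Keuchel | V.Martinez | B2 | 1 | 0 | 1 | 0 | 6-3 | 39.8 | 64.81 | -0.018 | -0.0168 |
| D.Keuchel | J.Peralta | B2 | 2 | 0 | 1 | 0 | BB | 41.2 | 66.04 | 0.014 | 0.0123 |
| D.Keuchel | M.Tuiasosopo | B2 | 2 | 1 | 1 | 0 | 1B | 44.1 | 69.1 | 0.029 | 0.0306 |
| D.Keuchel | B.Pena | B2 | 2 | 5 | 1 | 1 | 1B | 54.5 | 79.42 | 0.104 | 0.1032 |
| D.Keuchel | A.Garcia | B2 | 2 | 3 | 1 | 4 | HR | 80.3 | 95.61 | 0.258 | 0.1619 |
| D.Keuchel | O.Infante | B2 | 2 | 0 | 1 | 4 | 6-3 | 79.7 | 95.43 | -0.006 | -0.0018 |
| M.Scherzer | M.Dominguez | T3 | 0 | 0 | 1 | 4 | KS | 81.9 | 95.99 | -0.022 | -0.0056 |
| M.Scherzer | M.Gonzalez | T3 | 1 | 0 | 1 | 4 | 4-3 | 83.3 | 96.31 | -0.014 | -0.0032 |
| M.Scherzer | R.Grossman | T3 | 2 | 0 | 1 | 4 | KC | 84.2 | 96.51 | -0.009 | -0.002 |
| D.Keuchel | T.Hunter | B3 | 0 | 0 | 1 | 4 | F-9 | 83 | 96.16 | -0.012 | -0.0035 |
| D.Keuchel | M.Cabrera | B3 | 1 | 0 | 1 | 4 | 1B | 84.3 | 96.36 | 0.013 | 0.002 |
| D.Keuchel | P.Fielder | B3 | 1 | 1 | 1 | 4 | 4-6-3 | 81.7 | 95.48 | -0.026 | -0.0088 |
| M.Scherzer | J.Elmore | T4 | 0 | 0 | 1 | 4 | 2B | 76 | 93.78 | 0.057 | 0.017 |
| M.Scherzer | J.Elmore | T4 | 0 | 2 | 1 | 4 | Balk | 73.6 | 93.28 | 0.024 | 0.005 |
| M.Scherzer | C.Pena | T4 | 0 | 4 | 2 | 4 | 1B | 68.8 | 89.81 | 0.048 | 0.0347 |
| M.Scherzer | C.Carter | T4 | 0 | 1 | 2 | 4 | BB | 61.8 | 86.08 | 0.07 | 0.0373 |
| M.Scherzer | J.Martinez | T4 | 0 | 3 | 5 | 4 | HR | 37.5 | 59.19 | 0.243 | 0.2689 |
| M.Scherzer | C.Corporan | T4 | 0 | 0 | 5 | 4 | F-7 | 39.8 | 60.54 | -0.023 | -0.0135 |
| M.Scherzer | J.Paredes | T4 | 1 | 0 | 5 | 4 | 2B | 35.6 | 57.17 | 0.042 | 0.0337 |
| M.Scherzer | J.Paredes | T4 | 1 | 2 | 5 | 4 | CS | 41.5 | 61.46 | -0.059 | -0.0429 |
| M.Scherzer | M.Dominguez | T4 | 2 | 0 | 5 | 4 | F-5 | 42.6 | 62.2 | -0.011 | -0.0074 |
| D.Keuchel | V.Martinez | B4 | 0 | 0 | 5 | 4 | KS | 39.6 | 59.57 | -0.03 | -0.0263 |
| D.Keuchel | J.Peralta | B4 | 1 | 0 | 5 | 4 | BB | 42.9 | 62.06 | 0.033 | 0.0249 |
| D.Keuchel | M.Tuiasosopo | B4 | 1 | 1 | 5 | 4 | 1B | 47.8 | 66.37 | 0.049 | 0.0431 |
| D.Keuchel | B.Pena | B4 | 1 | 3 | 5 | 4 | 5-4-3 | 36 | 55.82 | -0.118 | -0.1055 |
| M.Scherzer | M.Gonzalez | T5 | 0 | 0 | 5 | 4 | 3-1 | 38.4 | 57.18 | -0.024 | -0.0136 |
| M.Scherzer | R.Grossman | T5 | 1 | 0 | 5 | 4 | F-9 | 40.1 | 58.28 | -0.017 | -0.011 |
| M.Scherzer | J.Elmore | T5 | 2 | 0 | 5 | 4 | KS | 41.3 | 58.94 | -0.012 | -0.0066 |
| D.Keuchel | A.Garcia | B5 | 0 | 0 | 5 | 4 | KS | 37.9 | 55.28 | -0.034 | -0.0366 |
| D.Keuchel | O.Infante | B5 | 1 | 0 | 5 | 4 | F-8 | 35.5 | 52.14 | -0.024 | -0.0314 |
| D.Keuchel | T.Hunter | B5 | 2 | 0 | 5 | 4 | F-8 | 33.8 | 50 | -0.017 | -0.0214 |
| M.Scherzer | C.Pena | T6 | 0 | 0 | 5 | 4 | KC | 36.3 | 51.51 | -0.025 | -0.0151 |
| M.Scherzer | C.Carter | T6 | 1 | 0 | 5 | 4 | KS | 38.1 | 52.72 | -0.018 | -0.0121 |
| M.Scherzer | J.Martinez | T6 | 2 | 0 | 5 | 4 | 6-3 | 39.4 | 53.43 | -0.013 | -0.0071 |
| D.Keuchel | M.Cabrera | B6 | 0 | 0 | 5 | 4 | F-8 | 35.4 | 52.57 | -0.04 | -0.0086 |
| D.Keuchel | P.Fielder | B6 | 1 | 0 | 5 | 4 | 4-3 | 32.5 | 56.18 | -0.029 | 0.0361 |
| D.Keuchel | V.Martinez | B6 | 2 | 0 | 5 | 4 | E-6 | 34.7 | 53.27 | 0.022 | -0.0291 |
| D.Keuchel | J.Peralta | B6 | 2 | 1 | 5 | 4 | 1B | 38.2 | 52.01 | 0.035 | -0.0126 |
| D.Keuchel | M.Tuiasosopo | B6 | 2 | 3 | 5 | 5 | 1B | 56.4 | 75.1 | 0.182 | 0.2309 |
| T.Blackley | B.Pena | B6 | 2 | 3 | 5 | 5 | F-4 | 50 | 69.53 | -0.064 | -0.0557 |
| M.Scherzer | C.Corporan | T7 | 0 | 0 | 5 | 5 | BB | 44.1 | 65.45 | 0.059 | 0.0408 |
| M.Scherzer | J.Paredes | T7 | 0 | 1 | 5 | 5 | FC | 49.7 | 69.18 | -0.056 | -0.0373 |
| M.Scherzer | M.Dominguez | T7 | 1 | 1 | 5 | 5 | F-8 | 54.6 | 71.75 | -0.049 | -0.0257 |
| M.Scherzer | J.Paredes | T7 | 2 | 1 | 5 | 5 | CS | 58.8 | 73.51 | -0.042 | -0.0176 |
| T.Blackley | A.Garcia | B7 | 0 | 0 | 5 | 5 | F-5 | 54.9 | 72.2 | -0.039 | -0.0131 |
| E.Gonzalez | O.Infante | B7 | 1 | 0 | 5 | 5 | 1B | 59 | 74.95 | 0.041 | 0.0275 |
| E.Gonzalez | O.Infante | B7 | 1 | 1 | 5 | 5 | Pick Off | 52.1 | 69.34 | -0.069 | -0.0561 |
| E.Gonzalez | T.Hunter | B7 | 2 | 0 | 5 | 5 | BB | 54.2 | 70.48 | 0.021 | 0.0114 |
| E.Gonzalez | M.Cabrera | B7 | 2 | 1 | 5 | 5 | KS | 50 | 65.44 | -0.042 | -0.0504 |
| D.Smyly | M.Gonzalez | T8 | 0 | 0 | 5 | 5 | KS | 54.7 | 67.57 | -0.047 | -0.0213 |
| D.Smyly | R.Grossman | T8 | 1 | 0 | 5 | 5 | 1B | 49.8 | 65.64 | 0.049 | 0.0193 |
| D.Smyly | J.Elmore | T8 | 1 | 1 | 5 | 5 | 1B | 43.1 | 59.67 | 0.067 | 0.0597 |
| D.Smyly | C.Pena | T8 | 1 | 3 | 5 | 5 | KC | 51.7 | 65.57 | -0.086 | -0.059 |
| A.Alburquerque | R.Grossman | T8 | 2 | 3 | 5 | 5 | WP | 48.8 | 64.3 | 0.029 | 0.0127 |
| A.Alburquerque | C.Carter | T8 | 2 | 6 | 5 | 5 | KS | 60.7 | 73.25 | -0.119 | -0.0895 |
| W.Wright | P.Fielder | B8 | 0 | 0 | 5 | 5 | KS | 56.1 | 68.95 | -0.046 | -0.043 |
| W.Wright | V.Martinez | B8 | 1 | 0 | 5 | 5 | F-8 | 52.6 | 65.31 | -0.035 | -0.0364 |
| H.Ambriz | J.Peralta | B8 | 2 | 0 | 5 | 5 | KS | 50 | 63.43 | -0.026 | -0.0188 |
| A.Alburquerque | J.Martinez | T9 | 0 | 0 | 5 | 5 | BB | 41.8 | 61.48 | 0.082 | 0.0195 |
| P.Coke | C.Corporan | T9 | 0 | 1 | 6 | 5 | 2B | 11.6 | 13.43 | 0.302 | 0.4805 |
| P.Coke | J.Paredes | T9 | 0 | 2 | 6 | 5 | SAC 5-3 | 11.9 | 13.29 | -0.003 | 0.0014 |
| P.Coke | M.Dominguez | T9 | 1 | 4 | 7 | 5 | SAC F-9 | 8.8 | 9.16 | 0.031 | 0.0413 |
| P.Coke | M.Gonzalez | T9 | 2 | 0 | 7 | 5 | 1-3 | 9.2 | 11.67 | -0.004 | -0.0251 |
| J.Veras | D.Kelly | B9 | 0 | 0 | 7 | 5 | 3-1 | 4.5 | 5.75 | -0.047 | -0.0592 |
| J.Veras | B.Pena | B9 | 1 | 0 | 7 | 5 | BB | 10.7 | 13.43 | 0.062 | 0.0768 |
| J.Veras | A.Dirks | B9 | 1 | 1 | 7 | 5 | F-6 | 4.5 | 5.73 | -0.062 | -0.077 |
| J.Veras | B.Pena | B9 | 2 | 1 | 7 | 5 | DI | 4.8 | 6.02 | 0.003 | 0.0029 |
| J.Veras | O.Infante | B9 | 2 | 2 | 7 | 5 | BB | 9.1 | 11.87 | 0.043 | 0.0585 |
| J.Veras | T.Hunter | B9 | 2 | 3 | 7 | 5 | HBP | 17.5 | 27.73 | 0.084 | 0.1586 |
| J.Veras | M.Cabrera | B9 | 2 | 7 | 7 | 5 | F-8 | 0 | 0 | -0.175 | -0.2773 |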
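
In the log above, the SIM WPA credited on a play appears to be the change in the Tigers' SIM WE, with the sign flipped when the Astros are batting (top of the inning). A small sketch of that bookkeeping follows; the function name `batter_wpaoe` and the `half` argument are made up for illustration, and the check covers only the three rows quoted in the code.

```python
def batter_wpaoe(home_we_before, home_we_after, half):
    """Credit to the batting side for one event.

    home_we_before / home_we_after: home-team (Tigers) WE in percent.
    half: "T" when the away team bats, "B" when the home team bats.
    """
    delta = (home_we_after - home_we_before) / 100.0
    return delta if half == "B" else -delta


# Three events taken from the log: (SIM WE before, SIM WE after, half).
events = {
    "Grossman BB, T1": (78.23, 75.93, "T"),   # listed SIM WPA 0.023
    "Infante KS, B1": (81.26, 79.72, "B"),    # listed SIM WPA -0.0154
    "Garcia HR, B2": (79.42, 95.61, "B"),     # listed SIM WPA 0.1619
}
for label, (before, after, half) in events.items():
    print(label, round(batter_wpaoe(before, after, half), 4))
```

Summing these credits by player over a game or a season would give that player's WPAOE total.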

Note: My post over at The Book Blog on this subject.

I’m still not convinced we need to treat each game with a 50/50 starting point for WPA, especially when you have a game where a team is a lopsided 78% favorite to win. Why not give the favored team 22% of positive WPA to play with? If Kershaw starts the game with the Dodgers having a 78% chance of winning and leaves the game with the Dodgers having a 78% chance of winning, then he has “done his job” WPA-wise. Why should he and the rest of his team get extra WPA just because they are playing against a bad team and they have a great pitcher starting for them? This seems worse.

0.0 WPA should be the baseline of expected WPA for each player, and giving a team that is a 78% favorite 22% of positive WPA to play with does just that. If you start them out at 50/50 and don’t take into consideration the likelihood of one team winning over the other, then you are padding their positive WPA with an extra 28% of positive WPA points on average. This just doesn’t seem right. So what if the winning team only accumulates 0.22 points of WPA and the losing team accumulates -0.22 points of WPA when the favorite wins? You are incorrectly assigning WPA to individual players by assuming every game is a 50/50 game.

If the initial Win Expectancy was correct and everything played out perfectly with the season simulated billions of times, you’d expect to see everyone’s WPA converge to zero. But the season is a single sample (162 games) and WPA is pretty much a “story” stat, so what you will see are players who exceeded expectations with a WPA over zero, players who fell short of expectations with a WPA under zero, and players who met expectations with a WPA right around zero.

Maybe a new stat?  Win Probability Added Over Expectation (WPAOE) ??

Tangotiger Response:
Everything you are saying is correct.  If Kershaw does his job exactly as intended, such that they had a 78% chance of winning entering the game and 78% chance of winning when he leaves, then he’s accumulated 0.0 wins IN-GAME.
