## Tuesday, December 17, 2013

### How Does BABIP Affect Run Scoring?

It is pretty obvious that the higher a team's batting average on balls in play (BABIP) is, the more runs they will score.  But the million-dollar question is: what is the relationship between BABIP and runs scored?  How many more or fewer runs can a team expect to score based on an increase or decrease in its BABIP?

When I posed this question to subject-matter expert Tom Tango, he gave me the following answer:
You get +.75 runs for turning a sure out into a sure hit.

If you change BABIP from .300 to .301, you will get an extra .001 x .75 runs per ball in play.

If you assume that 70% of PA are balls in play, then changing BABIP from .300 to .301, you will get an extra .70 x .001 x .75 runs per PA.

If you have say 38 PA per game, then changing BABIP from .300 to .301 will get you an extra 38 x .70 x .001 x .75 runs per game.

So, 1 point in BABIP is .02 runs per game.

Naturally, this only works at very modest changes. If you go from .300 to .400, well, that 38 PA won’t hold. On top of which, you have compounding effects, so runs are not linear any more.
*****     *****     *****     *****     *****
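The arithmetic in Tango's answer is easy to check. Here is a minimal Python sketch of it; the function name and parameter names are just labels I made up for illustration, and the defaults are the figures quoted above (38 PA per game, 70% of PA ending in a ball in play, +.75 runs per out turned into a hit).

```python
def runs_per_game_per_babip_point(pa_per_game=38, bip_share=0.70, run_value=0.75):
    """Runs per game gained from one point (0.001) of BABIP.

    Defaults are the figures quoted in the answer above; the function and
    parameter names are hypothetical labels, not anyone's official formula.
    """
    one_point = 0.001  # one "point" of BABIP
    return pa_per_game * bip_share * one_point * run_value


# 38 x .70 x .001 x .75, which rounds to the quoted .02 runs per game
print(round(runs_per_game_per_babip_point(), 5))
```

The product works out to just under .02, and, as the answer itself warns, it should only be trusted for very modest changes in BABIP.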
My intention all along was to use my simulator to figure this out, but now I had a baseline to compare my results against.  Would the simulator come up with something close to the "1 point in BABIP is .02 runs per game"?

This is where the power of the simulator comes in: it lets you pick your run environment and set the BABIP of every pitcher/hitter matchup to any value, all while leaving all other variables the same.  Maybe the 0.02 runs per game only holds for a certain BABIP value?  By using BABIP numbers all the way from 0.000 to 1.000, the simulator should be able to show what kind of relationship BABIP and runs scored have on a basic x/y line graph.  It can also zero in on the specific ranges of BABIP that are more common in the major leagues.

#### Methodology
I don't want to overload this post with all the boring details, so I will give you the basics.  I created two teams, Team A(way) and Team H(ome), making the teams fairly even and setting their run environment at right around 8.2 combined runs per game (Away team = 4.4 rpg, 0.300 BABIP).  Since the away team bats in the 9th inning of every game, I used them as the guinea pigs.  I hard-coded every single pitcher/hitter matchup for their team to have the same BABIP no matter what.  All other variables were held the same.  I would simulate 2.5 million games with the away team having a BABIP of 0.300 in one trial, then simulate another 2.5 million games with the away team having a BABIP of 0.301, and so on, then look at the results to see how the change in BABIP affected the total runs scored by the away team.  I didn't simulate every single BABIP from 0.000 to 1.000, but I did simulate every BABIP from 0.300 to 0.340 and many points between there and the 0.000 and 1.000 extremes in order to get a good graph of the relationship.
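To make the idea of "pegging" BABIP concrete, here is a toy Monte Carlo sketch in Python. This is not the simulator used for this post. It is a deliberately crude stand-in built on loud assumptions: the strikeout, walk, and home run rates are made-up placeholders, every hit on a ball in play is treated as a single that moves each runner up exactly one base, and there are no errors, double plays, or steals. The function name `simulate_runs_per_game` is hypothetical. Its only purpose is to show the shape of the experiment: hold everything else fixed, change one BABIP number, and compare average runs.

```python
import random


def simulate_runs_per_game(babip, games=20000, seed=0,
                           k_rate=0.20, bb_rate=0.08, hr_rate=0.03):
    """Toy estimate of runs per 9-inning game for one team at a pegged BABIP.

    k_rate, bb_rate and hr_rate are placeholder assumptions, not real data.
    k_rate must be > 0 so innings still end when babip is 1.0.
    """
    rng = random.Random(seed)
    total_runs = 0
    for _ in range(games):
        for _ in range(9):
            outs = 0
            bases = [False, False, False]  # runner on 1st, 2nd, 3rd
            while outs < 3:
                r = rng.random()
                if r < k_rate:                        # strikeout
                    outs += 1
                elif r < k_rate + bb_rate:            # walk: only forced runners move
                    if all(bases):
                        total_runs += 1
                    elif bases[0] and bases[1]:
                        bases[2] = True
                    elif bases[0]:
                        bases[1] = True
                    bases[0] = True
                elif r < k_rate + bb_rate + hr_rate:  # home run clears the bases
                    total_runs += 1 + sum(bases)
                    bases = [False, False, False]
                elif rng.random() < babip:            # ball in play falls for a hit
                    if bases[2]:
                        total_runs += 1
                    bases = [True, bases[0], bases[1]]
                else:                                 # ball in play is an out
                    outs += 1
    return total_runs / games


if __name__ == "__main__":
    # Same placeholder rates in both trials; only the pegged BABIP changes.
    base = simulate_runs_per_game(0.300)
    bumped = simulate_runs_per_game(0.310)
    print(f"runs/game at .300: {base:.3f}, at .310: {bumped:.3f}")
    print(f"runs per game per point of BABIP: {(bumped - base) / 10:.4f}")
```

With only tens of thousands of games the per-point estimate is noisy, which is the reason for running millions of games per trial when the differences being measured are hundredths of a run.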

The graph above shows the runs scored for the Away team on the y-axis and their BABIP on the x-axis. This graph gives you a good look at how the run totals change for all values of team BABIP from 0.000 to 1.000. When looking at the entire BABIP spectrum the plot looks non-linear.

Next up (below) is a graph showing the same thing but zooming in on the more common BABIP range (from 0.290 to 0.350) and as you can tell the plot now becomes linear for all practical purposes.

Now let's take a look at the BABIP at which Tom Tango's 0.02 runs per game for one point (0.001) of BABIP comes in. The plot below gives you a pretty good idea.

You can tell from the plot that the 0.02 value (runs per game for 1 point of BABIP) falls somewhere between a BABIP of 0.326 and 0.336. Below this range, 1 point of BABIP is worth less than 0.02 runs per game; above 0.336, it is worth more than 0.02.  This graph does have some noise in it, but you can still get a good idea of the trend.

So there is no one right answer without knowing the run environment you are in and what original BABIP you are using as a baseline.  If you use a run environment of around 4.4 runs per game (for the Away team) and a BABIP of 0.300 then one point (0.001) of BABIP is worth 0.0175 runs per game.  You don't see the 0.02 value until you raise the BABIP to over 0.326.

For the extremes you will see a runs/game value of around 0.01 when the BABIP is pegged at 0.150.  A BABIP of 0.400 will make one extra point of BABIP worth 0.025 runs per game.  A BABIP of 0.900 will make one extra point of BABIP worth 0.07 runs per game.

When you get to the extremes, the type of hitters and pitchers you have plays a bigger role in what a point of BABIP is worth.  When you use a very small BABIP number, hitters who hit a lot of HRs become more important to the offense, as almost any ball put into play will become an out.  The defense will want a pitcher who does not have a tendency to give up HRs.  When you use a very large BABIP number, hitters who do not strike out often become very valuable, as not many outs are made on balls in play, and of course the defense will want a pitcher who strikes out a lot of hitters.

And finally, here is a table showing how often the Away team won the game based on what their BABIP was pegged to.

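| BABIP | Away Runs | Win % |
|-------|-----------|-------|
| 0.000 | 1.3881 | 16.29% |
| 0.100 | 1.9519 | 24.59% |
| 0.200 | 2.9111 | 37.45% |
| 0.300 | 4.3959 | 54.38% |
| 0.400 | 6.5478 | 72.20% |
| 0.500 | 9.5178 | 86.66% |
| 0.600 | 13.4578 | 95.28% |
| 0.700 | 18.5295 | 98.8766% |
| 0.800 | 24.6065 | 99.8373% |
| 0.900 | 31.3568 | 99.9871% |
| 1.000 | 38.8769 | 99.9997% |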
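The run totals in the table above also give a rough cross-check on the per-point values quoted earlier. The short Python sketch below divides the change in away runs between consecutive rows by the 100 points of BABIP separating them. The helper name `runs_per_point` is just an illustrative label, and these are coarse averages over 100-point intervals, so they will not match the fine-grained values read off the plots exactly.

```python
# (BABIP, away runs per game) pairs copied from the table above.
TABLE = [
    (0.000, 1.3881), (0.100, 1.9519), (0.200, 2.9111), (0.300, 4.3959),
    (0.400, 6.5478), (0.500, 9.5178), (0.600, 13.4578), (0.700, 18.5295),
    (0.800, 24.6065), (0.900, 31.3568), (1.000, 38.8769),
]


def runs_per_point(rows):
    """Average runs per game per 0.001 of BABIP between consecutive rows."""
    slopes = []
    for (b0, r0), (b1, r1) in zip(rows, rows[1:]):
        points = (b1 - b0) * 1000  # interval width in points of BABIP
        slopes.append((b0, b1, (r1 - r0) / points))
    return slopes


for b0, b1, slope in runs_per_point(TABLE):
    print(f"{b0:.3f} to {b1:.3f}: {slope:.4f} runs per game per point")
```

The averages rise with every step, which is the non-linearity seen in the full-range graph. The 0.200 to 0.300 and 0.300 to 0.400 intervals come out to roughly 0.0148 and 0.0215, which bracket the 0.0175 figure at a BABIP of 0.300.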