Tuesday, December 31, 2013

Best Lineup - Tampa Bay Rays


Next up in my look at each team's most efficient lineup is the Tampa Bay Rays. In this exercise, the methodology is to use my simulator to find out which lineup wins the most games vs RH and LH pitchers. I do this by making the team of interest the "away" team, playing against a "make believe" team whose stats don't change from one sim to the next. In fact, no stats (or input projections) change for either team; the only difference from one simulation to the next is the lineup of the team of interest. For player projections, I am using Steamer projections, which are available on Fangraphs. The lineup results will only be as good as the projections. I am not a subject matter expert on every team's personnel, but I use MLBDepthcharts as a guide to which players are starters, and I tend to avoid hitting too many LH back to back when reasonably possible. Keep in mind, the results are not intended to match what a given team's manager is most likely to do during the season.

Previous teams:
AL: Angels | Rangers
NL: Mets | Cubs

See the results after the jump...

Monday, December 30, 2013

Best Lineup - Chicago Cubs


Next up in my look at each team's most efficient lineup is the Chicago Cubs. In this exercise, the methodology is to use my simulator to find out which lineup wins the most games vs RH and LH pitchers. I do this by making the team of interest the "away" team, playing against a "make believe" team whose stats don't change from one sim to the next. In fact, no stats (or input projections) change for either team; the only difference from one simulation to the next is the lineup of the team of interest. For player projections, I am using Steamer projections, which are available on Fangraphs. The lineup results will only be as good as the projections. I am not a subject matter expert on every team's personnel, but I use MLBDepthcharts as a guide to which players are starters, and I tend to avoid hitting LH back to back when reasonably possible. Keep in mind, the results are not intended to match what a given team's manager is most likely to do during the season.

Previous teams:
AL: Angels | Rangers
NL: Mets

See the results after the break.

Sunday, December 29, 2013

Best Lineup - Texas Rangers


Next up in my look at most efficient lineups is the Texas Rangers. I used my baseball simulator to run millions of games through various possible lineups to see which one it spit out as the most likely to win a game vs a RH and a LH pitcher. Each lineup was simulated for 2.5 million games.

Please keep in mind that 2014 Steamer Projections were used as input, so if you don't like some of the results take it up with them.

Previous teams:
AL: Angels
NL: Mets

See the results after the break.

Friday, December 27, 2013

Best Lineup - New York Mets


Next up in my look at most efficient lineups is the New York Mets. I used my baseball simulator to run millions of games through various possible lineups to see which one it spit out as the most likely to win a game vs a RH and a LH pitcher. I tried my best not to stack left-handed hitters, and I always batted the pitcher 9th because no MLB manager will bat his pitcher 8th, which is where most should hit.

Please keep in mind that 2014 Steamer Projections were used as input, so if you don't like some of the results take it up with them.

Previous teams:
AL: Angels
NL: None

See the results after the break.

Monday, December 23, 2013

Best Lineup - Los Angeles Angels


Not sure if this is going to be a series for all teams or just some teams, but I am going to kick things off with the best lineup for the Los Angeles Angels. The methodology is to use my simulator to find out which lineup wins the most games vs RH and LH pitchers. I do this by making the team of interest the "away" team, playing against a "make believe" team whose stats don't change from one sim to the next. In fact, no stats (or input projections) change for either team; the only difference from one simulation to the next is the lineup of the team of interest. For player projections, I am using Steamer projections, which are available on Fangraphs. The lineup results will only be as good as the projections. I am not a subject matter expert on every team's personnel, but I use MLBDepthcharts as a guide to which players are starters, and I tend to avoid hitting LH back to back when reasonably possible. Two million simulations make up the sample size.

See the results after the break.

Wednesday, December 18, 2013

Battle Of The Gold Gloves


One of the benefits of having a program that can accurately simulate a baseball game is that you can model pretty much anything and let the law of large numbers (or samples) do the dirty work for you. In my latest exercise, I decided to take the 2013 Gold Glove winners from both the National and American leagues and have them play against each other. In order to make it fair, I ran sets of the simulation with each team being away/home and facing both a LH and a RH starting pitcher. I gave both teams the identical starting pitcher, bench and bullpen so that the only difference was the starting players. I played by NL rules with no DH and gave both teams the same hitting skill for their pitcher. Afterwards, I did the same thing but this time made all the players league average fielders to see which side was better solely on offense.

The simulator also allows me to determine the most efficient lineup for both teams (facing RHP and LHP). The lineups that you see for both teams were the highest scoring lineups according to the simulator. I added the restriction of not batting any left-handed hitters back to back, as this seems to be something that most MLB managers follow, and I always batted the pitcher ninth.

Here are the lineups

vs RHP | GGNL | GGAL | vs LHP | GGNL | GGAL
1 | G.Parra | D.Pedroia | 1 | G.Parra | D.Pedroia
2 | Y.Molina | A.Gordon | 2 | Y.Molina | S.Victorino
3 | P.Goldschmidt | S.Victorino | 3 | P.Goldschmidt | E.Hosmer
4 | C.Gonzalez | E.Hosmer | 4 | C.Gonzalez | A.Jones
5 | N.Arenado | A.Jones | 5 | N.Arenado | A.Gordon
6 | C.Gomez | S.Perez | 6 | A.Simmons | S.Perez
7 | A.Simmons | M.Machado | 7 | B.Phillips | M.Machado
8 | B.Phillips | J.Hardy | 8 | C.Gomez | J.Hardy
9 | Pitcher | Pitcher | 9 | Pitcher | Pitcher
(GGNL - Gold Glove NL, GGAL - Gold Glove AL)

And here are the results

This table has all the players set to their defensive values.
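Description | Away | Home | Winner | Away RS | Home RS | Win % | Total Runs
vs RHP | GGNL | GGAL | GGAL | 3.43 | 3.32 | 50.42 | 6.75
vs RHP | GGAL | GGNL | GGNL | 3.14 | 3.61 | 57.72 | 6.75
vs LHP | GGNL | GGAL | GGAL | 3.48 | 3.36 | 50.37 | 6.84
vs LHP | GGAL | GGNL | GGNL | 3.16 | 3.67 | 58.13 | 6.83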

... and this table has all the players set to league average defensive values.
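Description | Away | Home | Winner | Away RS | Home RS | Win % | Total Runs
vs RHP | GGNL | GGAL | GGAL | 3.77 | 3.80 | 52.19 | 7.57
vs RHP | GGAL | GGNL | GGNL | 3.60 | 3.97 | 56.33 | 7.57
vs LHP | GGNL | GGAL | GGAL | 3.82 | 3.84 | 52.10 | 7.66
vs LHP | GGAL | GGNL | GGNL | 3.63 | 4.04 | 56.69 | 7.67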

Back Napkin Analysis:
It looks like the National League team is better both defensively and offensively. Now keep in mind that the results will reflect the input data, meaning the player projections on both offense and defense. Not wanting to be biased, I used 2014 Steamer projections for the offense, and I eyeballed the defensive values for each player from a mixture of UZR, FSR and ZiPS (if available). I tended not to go above 15 runs saved per 150 games for any player. Below are the defensive numbers I used for each player.

Pos | NL | AL
C | Y.Molina (15) | S.Perez (13)
1B | P.Goldschmidt (4) | E.Hosmer (3)
2B | B.Phillips (8) | D.Pedroia (10)
3B | N.Arenado (12) | M.Machado (15)
SS | A.Simmons (15) | J.Hardy (10)
LF | C.Gonzalez (10) | A.Gordon (7)
CF | C.Gomez (13) | A.Jones (-3)
RF | G.Parra (12) | S.Victorino (15)

Tuesday, December 17, 2013

How Does BABIP Affect Run Scoring


It is pretty obvious: the higher a team's batting average on balls in play (BABIP), the more runs they will score. But the million-dollar question is, what is the relationship between BABIP and runs scored? How many more or fewer runs can a team expect to score based on an increase or decrease in its BABIP?

When I posed this question to subject matter expert Tom Tango, he gave me the following answer.
You get +.75 runs for turning a sure out into a sure hit.

If you change BABIP from .300 to .301, you will get an extra .001 x .75 runs per ball in play.

If you assume that 70% of PA are balls in play, then changing BABIP from .300 to .301, you will get an extra .70 x .001 x .75 runs per PA.

If you have say 38 PA per game, then changing BABIP from .300 to .301 will get you an extra 38 x .70 x .001 x .75 runs per game.

So, 1 point in BABIP is .02 runs per game.

Naturally, this only works at very modest changes. If you go from .300 to .400, well, that 38 PA won’t hold. On top of which, you have compounding effects, so runs are not linear any more.
          *****     *****     *****     *****     *****
My intention all along was to use my simulator to figure this out but now I had a baseline to compare my results against.  Would the simulator come up with something close to the "1 point in BABIP is .02 runs per game"?
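
For reference, the baseline arithmetic quoted above can be reproduced in a few lines. This is only a sketch of that back-of-the-envelope calculation; the function and parameter names are made up for illustration, and the default values are the ones from the quote.

```python
def runs_per_game_per_babip_point(
    run_value_out_to_hit=0.75,  # runs gained by turning a sure out into a sure hit
    balls_in_play_share=0.70,   # assumed share of PA that end with a ball in play
    pa_per_game=38,             # assumed plate appearances per team per game
    babip_point=0.001,          # one "point" of BABIP
):
    """Linear estimate of what one point of BABIP is worth in runs per game."""
    return pa_per_game * balls_in_play_share * babip_point * run_value_out_to_hit


value = runs_per_game_per_babip_point()
print(round(value, 5))  # 0.01995, which rounds to the quoted .02
```

As the quote notes, this linear estimate is only meant for very modest changes in BABIP.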

This is where the power of the simulator comes in: it lets you pick your run environment and set the BABIP of all pitcher/hitter matchups to any value, all the while leaving all other variables the same. Maybe the 0.02 runs per game only holds for a certain BABIP value? By using BABIP numbers all the way from 0.000 to 1.000, the simulator should be able to show the relationship between BABIP and runs scored on a basic x/y line graph. It can also zero in on the specific ranges of BABIP that are more common in the major leagues.

Methodology
I don't want to overload this post with all the boring details (tl;dr), so I will give you the basics. I created two teams, Team A(way) and Team H(ome), making them fairly even and setting their run environment at right around 8.2 combined runs per game (Away team = 4.4 rpg at a 0.300 BABIP). Since the away team bats in the 9th inning every game, I used them as the guinea pigs. I hard-coded every single pitcher/hitter matchup for their team to have the same BABIP no matter what. All other variables were held the same. I would simulate 2.5 million games with the away team having a BABIP of 0.300 in one trial, then turn around and simulate 2.5 million games with the away team having a BABIP of 0.301, and so on, then look at the results to see how the change in BABIP affected the away team's runs scored. Now, I didn't simulate every single BABIP from 0.000 to 1.000, but I did simulate every BABIP from 0.300 to 0.340, plus many of the points between that range and the 0.000 and 1.000 endpoints, in order to get a good graph of the relationship.
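
To make the idea concrete, here is a toy sketch of the same kind of experiment: pin BABIP to one value, hold everything else fixed, simulate many games, and compare average runs. It is not the simulator used for this post. The strikeout, walk and home run rates are invented, every hit on a ball in play is treated as a single that moves each runner up one base, and all nine innings are always batted. The function names are hypothetical as well.

```python
import random

# Assumed, illustrative rates per plate appearance (not taken from the post).
K_RATE = 0.20   # strikeouts
BB_RATE = 0.08  # walks
HR_RATE = 0.03  # home runs


def simulate_half_inning(babip, rng):
    """Runs scored in one half-inning of a deliberately crude model."""
    outs, runs = 0, 0
    bases = [False, False, False]  # first, second, third
    while outs < 3:
        roll = rng.random()
        if roll < K_RATE:
            outs += 1
        elif roll < K_RATE + BB_RATE:
            # Walk: only forced runners advance.
            if bases[0]:
                if bases[1]:
                    if bases[2]:
                        runs += 1
                    bases[2] = True
                bases[1] = True
            bases[0] = True
        elif roll < K_RATE + BB_RATE + HR_RATE:
            # Home run: batter and all runners score.
            runs += 1 + sum(bases)
            bases = [False, False, False]
        elif rng.random() < babip:
            # Ball in play falls for a hit (modeled as a single).
            if bases[2]:
                runs += 1
            bases = [True, bases[0], bases[1]]
        else:
            # Ball in play is turned into an out.
            outs += 1
    return runs


def mean_runs_per_game(babip, games=5000, seed=1):
    """Average runs over `games` nine-inning games at a fixed BABIP."""
    rng = random.Random(seed)
    total = sum(simulate_half_inning(babip, rng) for _ in range(games * 9))
    return total / games


if __name__ == "__main__":
    for babip in (0.250, 0.300, 0.350):
        print(babip, round(mean_runs_per_game(babip, games=2000), 2))
```

With so few games the averages are noisy, which is the reason for running millions of games per BABIP value when the differences of interest are hundredths of a run.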


The graph above shows the runs scored for the Away team on the y-axis and their BABIP on the x-axis. This graph gives you a good look at how the run totals change for all values of team BABIP from 0.000 to 1.000. When looking at the entire BABIP spectrum the plot looks non-linear.

Next up (below) is a graph showing the same thing but zooming in on the more common BABIP range (from 0.290 to 0.350) and as you can tell the plot now becomes linear for all practical purposes.


Now let's take a look at the BABIP value at which Tom Tango's 0.02 runs per game for 0.001 of BABIP shows up. The plot below gives you a pretty good idea.


You can tell from the plot that the 0.02 (run per game, for 1 point of BABIP) is somewhere in between 0.326 and 0.336. Anything below this range and you are looking at a number less than 0.02 for what 1 point of BABIP is worth and anything greater than 0.336 you are looking at a number greater than 0.02 for what 1 point of BABIP is worth.  This graph does have some noise in it, but you can still get a good idea of the trend.

So there is no one right answer without knowing the run environment you are in and what original BABIP you are using as a baseline.  If you use a run environment of around 4.4 runs per game (for the Away team) and a BABIP of 0.300 then one point (0.001) of BABIP is worth 0.0175 runs per game.  You don't see the 0.02 value until you raise the BABIP to over 0.326.

For the extremes you will see a runs/game value of around 0.01 when the BABIP is pegged at 0.150.  A BABIP of 0.400 will make one extra point of BABIP worth 0.025 runs per game.  A BABIP of 0.900 will make one extra point of BABIP worth 0.07 runs per game.
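
As a rough cross-check, the two per-point values quoted above (0.0175 at a 0.300 BABIP and 0.025 at a 0.400 BABIP) can be interpolated to estimate where one point of BABIP reaches 0.02 runs per game. This sketch assumes the per-point value rises in a straight line between those two BABIP levels, which is a simplification and not something the simulator results guarantee; the function name is made up for illustration.

```python
def babip_where_value_hits(target, babip_lo, value_lo, babip_hi, value_hi):
    """Linearly interpolate the BABIP at which one point is worth `target` runs/game."""
    fraction = (target - value_lo) / (value_hi - value_lo)
    return babip_lo + fraction * (babip_hi - babip_lo)


# Per-point values quoted in the text: 0.0175 at .300 and 0.025 at .400.
crossover = babip_where_value_hits(0.02, 0.300, 0.0175, 0.400, 0.025)
print(f"{crossover:.3f}")  # 0.333
```

The estimate falls inside the 0.326 to 0.336 range read off the plot.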

When you get to the extremes, the type of hitters and pitchers you have plays a bigger role in what a point of BABIP is worth. When you use a very small BABIP number, hitters who hit a lot of HRs become more important to the offense, as almost any ball put into play will become an out. The defense will want a pitcher who does not have a tendency to give up HRs. When you use a very large BABIP number, hitters who do not strike out often become very valuable, as not many outs are made on balls in play, and of course the defense will want a pitcher who strikes out a lot of hitters.

And finally, here is a table showing how often the Away team won the game based on what their BABIP was pegged to.

BABIP | Away Runs | Win %
0.000 | 1.3881 | 16.29%
0.100 | 1.9519 | 24.59%
0.200 | 2.9111 | 37.45%
0.300 | 4.3959 | 54.38%
0.400 | 6.5478 | 72.20%
0.500 | 9.5178 | 86.66%
0.600 | 13.4578 | 95.28%
0.700 | 18.5295 | 98.8766%
0.800 | 24.6065 | 99.8373%
0.900 | 31.3568 | 99.9871%
1.000 | 38.8769 | 99.9997%
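
The runs column in this table also gives a coarse picture of what a point of BABIP is worth across the whole range. The sketch below takes the difference in Away runs between neighboring rows and divides by the 100 points of BABIP that separate them. These are averages over each 100-point interval, so they are coarser than the point-by-point values discussed earlier; the variable and function names are just for illustration.

```python
# (BABIP, Away runs per game) pairs copied from the table above.
TABLE = [
    (0.000, 1.3881), (0.100, 1.9519), (0.200, 2.9111), (0.300, 4.3959),
    (0.400, 6.5478), (0.500, 9.5178), (0.600, 13.4578), (0.700, 18.5295),
    (0.800, 24.6065), (0.900, 31.3568), (1.000, 38.8769),
]


def runs_per_point(rows):
    """Average runs per game gained per 0.001 of BABIP between adjacent rows."""
    results = []
    for (babip_lo, runs_lo), (babip_hi, runs_hi) in zip(rows, rows[1:]):
        points = round((babip_hi - babip_lo) * 1000)  # 100 points per step here
        results.append((babip_lo, babip_hi, (runs_hi - runs_lo) / points))
    return results


slopes = runs_per_point(TABLE)
for babip_lo, babip_hi, value in slopes:
    print(f"{babip_lo:.3f} to {babip_hi:.3f}: {value:.4f} runs per point")
```

The per-point value climbs with every step up in BABIP, which is the non-linear shape seen in the full-range graph.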