Estimating utility of called pitches

Scott Spencer (https://ssp3nc3r.github.io), Columbia University (https://www.columbia.edu)
Last updated 2020 June 11

In baseball, teams value events that create runs for them and prevent runs for the opposing team. Having runners on base and avoiding outs helps a team score; that is, the game state matters. So do the balls and strikes accruing in the pitcher-batter matchup. But how much does each matter? A Bayesian model fitted with Stan to three seasons of data (2017-2019) suggests that, for takes, we observe the following expected runs given a game state and count:

Expected runs by game state (rows, runners:outs) and count (columns, balls-strikes)
Game state 0-0 0-1 0-2 1-0 1-1 1-2 2-0 2-1 2-2 3-0 3-1 3-2
No outs
—:0 0.54 0.50 0.44 0.59 0.54 0.47 0.66 0.61 0.54 0.77 0.74 0.69
1–:0 0.96 0.89 0.79 1.04 0.96 0.85 1.16 1.08 0.96 1.34 1.29 1.22
-2-:0 1.17 1.10 1.00 1.25 1.18 1.06 1.37 1.29 1.17 1.55 1.50 1.43
12-:0 1.54 1.47 1.35 1.65 1.55 1.42 1.79 1.69 1.55 2.01 1.95 1.86
–3:0 1.40 1.36 1.29 1.45 1.40 1.33 1.53 1.48 1.40 1.65 1.61 1.57
1-3:0 1.80 1.71 1.57 1.91 1.81 1.65 2.08 1.96 1.80 2.33 2.26 2.16
-23:0 1.93 1.86 1.75 2.02 1.94 1.82 2.15 2.06 1.94 2.34 2.29 2.21
123:0 2.28 2.19 2.03 2.40 2.29 2.12 2.58 2.46 2.29 2.84 2.77 2.67
One out
—:1 0.29 0.26 0.22 0.32 0.29 0.24 0.37 0.34 0.29 0.44 0.42 0.39
1–:1 0.56 0.51 0.43 0.62 0.56 0.48 0.71 0.65 0.56 0.84 0.81 0.75
-2-:1 0.71 0.66 0.58 0.77 0.71 0.63 0.86 0.80 0.71 1.00 0.96 0.91
12-:1 0.98 0.91 0.79 1.07 0.99 0.86 1.21 1.11 0.98 1.41 1.35 1.27
–3:1 0.97 0.89 0.77 1.07 0.98 0.85 1.21 1.11 0.97 1.42 1.36 1.28
1-3:1 1.21 1.13 0.98 1.33 1.22 1.07 1.50 1.38 1.22 1.75 1.68 1.58
-23:1 1.33 1.24 1.09 1.45 1.34 1.18 1.62 1.51 1.34 1.88 1.81 1.71
123:1 1.62 1.50 1.31 1.78 1.63 1.43 1.99 1.84 1.63 2.33 2.24 2.10
Two outs
—:2 0.11 0.10 0.07 0.13 0.11 0.09 0.16 0.14 0.11 0.20 0.19 0.17
1–:2 0.23 0.20 0.15 0.27 0.24 0.18 0.33 0.29 0.23 0.42 0.40 0.36
-2-:2 0.30 0.26 0.20 0.36 0.31 0.24 0.44 0.38 0.31 0.55 0.52 0.48
12-:2 0.45 0.39 0.29 0.53 0.46 0.35 0.65 0.57 0.46 0.82 0.77 0.70
–3:2 0.35 0.30 0.22 0.41 0.36 0.27 0.50 0.44 0.35 0.64 0.60 0.55
1-3:2 0.50 0.43 0.32 0.59 0.50 0.38 0.71 0.62 0.50 0.90 0.85 0.77
-23:2 0.52 0.45 0.33 0.62 0.53 0.40 0.75 0.66 0.53 0.95 0.89 0.81
123:2 0.82 0.71 0.53 0.97 0.83 0.64 1.18 1.03 0.83 1.49 1.41 1.28
Note: Expectations from model fit to Statcast data, 2017-2019.

To estimate these expectations, I collected 2017-2019 data from MLB’s Statcast: each pitch, the game state on that pitch, the number of balls and strikes before the pitch, whether the batter swung and, if not, the umpire’s call (ball or strike), and whether the matchup resulted in a walk or strikeout. Finally, I calculated the runs scored through the rest of the half-inning.

It may seem intuitive to just average the runs through the half-inning within each game-state-count stratum. But some strata are too rare to estimate reliably. The 388,657 pitches in 2019 where the batter did not swing fall into 288 strata (24 combinations of runners and outs times 12 combinations of balls and strikes in a count), and many of these strata have few observations:

Game State Balls Strikes Observations Frequency
–3:0 3 1 15 0.0000386
–3:0 3 0 19 0.0000489
1-3:0 3 1 21 0.0000540
123:0 3 2 21 0.0000540
123:0 3 0 25 0.0000643
1-3:0 3 0 26 0.0000669
1-3:0 3 2 27 0.0000695
123:0 3 1 28 0.0000720
–3:0 2 1 29 0.0000746
-23:0 3 1 31 0.0000798
–3:0 2 0 32 0.0000823
–3:0 3 2 33 0.0000849
-23:0 3 2 33 0.0000849
-23:0 3 0 36 0.0000926
–3:0 1 1 44 0.0001132
–3:0 0 2 45 0.0001158
123:0 2 1 45 0.0001158
123:0 2 0 46 0.0001184
123:0 2 2 51 0.0001312
123:1 3 0 51 0.0001312
–3:0 2 2 52 0.0001338
123:1 3 1 52 0.0001338
-23:0 2 0 54 0.0001389
-23:1 3 1 55 0.0001415
1-3:0 2 0 55 0.0001415
123:1 3 2 56 0.0001441
-23:0 2 1 58 0.0001492
1-3:0 2 2 59 0.0001518
–3:1 3 1 60 0.0001544
-23:0 0 2 61 0.0001570
1-3:0 2 1 63 0.0001621
–3:0 1 2 64 0.0001647
1-3:1 3 0 65 0.0001672
–3:0 0 1 66 0.0001698
1-3:1 3 1 66 0.0001698
-23:0 2 2 67 0.0001724
–3:0 1 0 69 0.0001775
1-3:1 3 2 74 0.0001904
123:0 0 2 78 0.0002007
123:0 1 2 80 0.0002058
12-:0 3 1 83 0.0002136
1-3:0 0 2 84 0.0002161
12-:0 3 0 87 0.0002238
-23:0 1 2 90 0.0002316
123:2 3 1 90 0.0002316
1-3:2 3 1 94 0.0002419
–3:1 3 2 96 0.0002470
–3:1 3 0 97 0.0002496
-23:0 1 1 97 0.0002496
-23:1 3 2 99 0.0002547
123:0 1 1 100 0.0002573
1-3:0 1 2 101 0.0002599
123:1 2 0 104 0.0002676
-23:2 3 1 105 0.0002702
123:2 3 2 109 0.0002805
1-3:0 1 1 112 0.0002882
123:1 2 1 112 0.0002882
1-3:2 3 0 114 0.0002933
–3:1 2 1 119 0.0003062
12-:0 3 2 120 0.0003088
123:2 3 0 120 0.0003088
-23:0 0 1 121 0.0003113
-2-:0 3 1 126 0.0003242
1-3:1 2 0 128 0.0003293
123:1 2 2 129 0.0003319
-23:1 0 2 130 0.0003345
-23:1 2 1 131 0.0003371
1-3:2 3 2 133 0.0003422
–3:2 3 1 134 0.0003448
123:0 0 1 139 0.0003576
123:0 1 0 139 0.0003576
1-3:1 2 1 141 0.0003628
-23:0 1 0 142 0.0003654
-2-:0 3 0 143 0.0003679
–3:1 2 0 144 0.0003705
-23:2 3 2 147 0.0003782
–3:1 0 2 154 0.0003962
-23:1 2 2 158 0.0004065
12-:1 3 0 163 0.0004194
1-3:1 2 2 169 0.0004348
12-:1 3 1 169 0.0004348
–3:1 2 2 172 0.0004425
1-3:0 0 1 173 0.0004451
123:2 2 1 176 0.0004528
123:1 0 2 177 0.0004554
12-:0 2 0 178 0.0004580
12-:0 2 1 178 0.0004580
-23:2 0 2 179 0.0004606
-23:1 3 0 180 0.0004631
-2-:0 3 2 183 0.0004709
1-3:0 1 0 185 0.0004760
-23:2 2 1 191 0.0004914
-23:2 3 0 194 0.0004992
–3:2 3 2 195 0.0005017
–3:2 3 0 198 0.0005094
–3:0 0 0 201 0.0005172
123:2 2 0 201 0.0005172
-23:1 1 2 204 0.0005249
12-:1 3 2 204 0.0005249
123:1 1 2 207 0.0005326
12-:2 3 1 210 0.0005403
-23:1 1 1 214 0.0005506
1-3:2 2 1 214 0.0005506
12-:2 3 0 215 0.0005532
–3:1 1 2 216 0.0005558
12-:0 2 2 218 0.0005609
-23:1 2 0 220 0.0005661
123:2 2 2 221 0.0005686
1-3:2 2 0 227 0.0005841
1-3:1 0 2 230 0.0005918
-2-:1 3 1 235 0.0006046
-2-:0 2 0 240 0.0006175
-2-:0 2 1 240 0.0006175
-23:2 2 2 240 0.0006175
1-3:1 1 2 240 0.0006175
123:2 0 2 243 0.0006252
–3:1 1 1 251 0.0006458
-23:2 2 0 256 0.0006587
1-3:2 2 2 259 0.0006664
–3:1 0 1 263 0.0006767
-23:1 0 1 264 0.0006793
–3:2 0 2 266 0.0006844
123:1 1 1 266 0.0006844
–3:2 2 1 271 0.0006973
-23:2 1 2 283 0.0007281
–3:2 2 0 287 0.0007384
-2-:0 0 2 287 0.0007384
1-3:1 1 1 287 0.0007384
-23:2 1 1 302 0.0007770
12-:2 3 2 306 0.0007873
–3:1 1 0 308 0.0007925
123:2 1 2 310 0.0007976
1–:0 3 1 311 0.0008002
1-3:2 0 2 315 0.0008105
-2-:1 3 0 317 0.0008156
12-:0 0 2 319 0.0008208
-2-:2 3 1 322 0.0008285
-23:0 0 0 328 0.0008439
-23:2 0 1 331 0.0008517
12-:1 2 0 333 0.0008568
123:1 1 0 338 0.0008697
1–:0 3 0 349 0.0008980
-2-:0 2 2 351 0.0009031
–3:2 2 2 362 0.0009314
-2-:1 3 2 369 0.0009494
12-:1 2 1 369 0.0009494
12-:0 1 2 378 0.0009726
123:1 0 1 378 0.0009726
-23:1 1 0 383 0.0009854
12-:0 1 1 390 0.0010035
1-3:1 0 1 391 0.0010060
1-3:2 1 2 391 0.0010060
123:2 1 1 391 0.0010060
1-3:1 1 0 404 0.0010395
–3:2 1 2 408 0.0010498
1–:1 3 1 409 0.0010523
123:0 0 0 420 0.0010806
1–:1 3 0 427 0.0010987
12-:1 2 2 438 0.0011270
-2-:0 1 2 439 0.0011295
-23:2 1 0 443 0.0011398
1-3:2 1 1 444 0.0011424
12-:2 2 0 444 0.0011424
1–:0 3 2 446 0.0011475
-2-:1 2 1 451 0.0011604
1–:2 3 0 455 0.0011707
-2-:0 1 1 464 0.0011939
1–:2 3 1 464 0.0011939
12-:2 2 1 472 0.0012144
–3:2 1 1 475 0.0012222
-2-:2 3 2 488 0.0012556
123:2 1 0 493 0.0012685
-2-:1 2 0 511 0.0013148
1-3:0 0 0 514 0.0013225
123:2 0 1 525 0.0013508
-2-:2 3 0 529 0.0013611
12-:1 0 2 529 0.0013611
-2-:1 0 2 532 0.0013688
1–:1 3 2 535 0.0013765
1–:2 3 2 540 0.0013894
12-:0 1 0 546 0.0014048
–3:2 0 1 573 0.0014743
12-:0 0 1 591 0.0015206
-2-:0 1 0 608 0.0015644
-2-:1 2 2 612 0.0015747
-2-:0 0 1 613 0.0015772
12-:2 2 2 626 0.0016107
1-3:2 1 0 627 0.0016132
1-3:2 0 1 636 0.0016364
-2-:2 2 1 641 0.0016493
–3:2 1 0 646 0.0016621
12-:1 1 2 647 0.0016647
12-:2 0 2 702 0.0018062
-2-:1 1 2 707 0.0018191
1–:0 2 0 707 0.0018191
-2-:2 0 2 745 0.0019169
1–:0 2 1 746 0.0019194
12-:1 1 1 753 0.0019374
–3:1 0 0 772 0.0019863
-2-:2 2 0 804 0.0020687
-2-:1 1 1 842 0.0021664
-23:1 0 0 846 0.0021767
12-:2 1 2 855 0.0021999
-2-:2 2 2 859 0.0022102
1–:1 2 0 886 0.0022796
1–:2 2 0 931 0.0023954
1–:0 2 2 958 0.0024649
—:2 3 0 981 0.0025241
-23:2 0 0 984 0.0025318
12-:1 1 0 988 0.0025421
12-:2 1 1 1001 0.0025755
-2-:2 1 2 1015 0.0026116
1–:1 2 1 1025 0.0026373
123:1 0 0 1037 0.0026682
1–:2 2 1 1038 0.0026707
—:2 3 1 1083 0.0027865
-2-:1 0 1 1088 0.0027994
-2-:2 1 1 1098 0.0028251
—:1 3 0 1120 0.0028817
12-:1 0 1 1144 0.0029435
1-3:1 0 0 1161 0.0029872
1–:1 2 2 1197 0.0030798
—:1 3 1 1224 0.0031493
-2-:1 1 0 1232 0.0031699
1–:0 0 2 1304 0.0033551
1–:2 2 2 1325 0.0034092
12-:2 1 0 1330 0.0034220
12-:2 0 1 1361 0.0035018
1–:0 1 2 1439 0.0037025
123:2 0 0 1441 0.0037076
-2-:2 0 1 1450 0.0037308
—:0 3 0 1537 0.0039546
1–:1 0 2 1574 0.0040498
–3:2 0 0 1608 0.0041373
1–:2 0 2 1625 0.0041811
-2-:2 1 0 1675 0.0043097
—:2 3 2 1704 0.0043843
12-:0 0 0 1711 0.0044023
—:0 3 1 1713 0.0044075
-2-:0 0 0 1714 0.0044101
1–:0 1 1 1726 0.0044409
1-3:2 0 0 1775 0.0045670
1–:1 1 2 1782 0.0045850
1–:2 1 2 1878 0.0048320
—:1 3 2 1982 0.0050996
—:2 2 0 2009 0.0051691
1–:1 1 1 2180 0.0056091
1–:0 1 0 2196 0.0056502
1–:2 1 1 2291 0.0058947
—:1 2 0 2363 0.0060799
—:0 3 2 2519 0.0064813
1–:0 0 1 2589 0.0066614
—:2 2 1 2597 0.0066820
1–:1 1 0 2714 0.0069830
1–:2 1 0 2894 0.0074462
—:1 2 1 2979 0.0076649
12-:1 0 0 2986 0.0076829
-2-:1 0 0 3042 0.0078270
1–:1 0 1 3210 0.0082592
—:0 2 0 3249 0.0083596
1–:2 0 1 3357 0.0086374
—:2 2 2 3538 0.0091031
—:2 0 2 3635 0.0093527
12-:2 0 0 3838 0.0098750
-2-:2 0 0 4015 0.0103304
—:0 2 1 4069 0.0104694
—:1 2 2 4150 0.0106778
—:1 0 2 4523 0.0116375
—:2 1 2 4724 0.0121547
—:2 1 1 5403 0.0139017
—:0 2 2 5559 0.0143031
—:1 1 2 6007 0.0154558
—:0 0 2 6092 0.0156745
—:2 1 0 6152 0.0158289
—:1 1 1 6543 0.0168349
1–:0 0 0 7081 0.0182191
—:2 0 1 7423 0.0190991
—:1 1 0 7463 0.0192020
—:0 1 2 7870 0.0202492
1–:1 0 0 8698 0.0223796
1–:2 0 0 8891 0.0228762
—:0 1 1 9039 0.0232570
—:1 0 1 9453 0.0243222
—:0 1 0 10277 0.0264423
—:0 0 1 12877 0.0331320
—:2 0 0 19268 0.0495758
—:1 0 0 24418 0.0628266
—:0 0 0 34262 0.0881549

Such small counts of observations within each stratum mean that long-run frequencies of runs through the half-inning will be too unreliable for decision-making. Let’s consider a different approach.
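As a rough illustration of the sparsity problem, the sketch below uses only figures quoted above: the 2019 total of takes and the smallest and largest stratum counts from the table. The comparison of standard errors assumes, purely for illustration, that runs have the same spread in both strata; the names are hypothetical.

```python
import math

TOTAL_TAKES_2019 = 388_657   # pitches without a swing in 2019 (from the text)
N_GAME_STATES = 24           # combinations of runners and outs
N_COUNTS = 12                # combinations of balls and strikes

n_strata = N_GAME_STATES * N_COUNTS


def frequency(observations: int, total: int = TOTAL_TAKES_2019) -> float:
    """Share of all takes that fall in one stratum."""
    return observations / total


# Smallest and largest stratum counts in the table above.
rarest, commonest = 15, 34_262

# The standard error of a sample mean scales as 1 / sqrt(n). Assuming equal
# spread in both strata (an illustrative assumption), the rarest stratum's
# mean is this many times noisier than the commonest stratum's mean.
se_ratio = math.sqrt(commonest / rarest)

print(f"strata: {n_strata}")
print(f"rarest stratum frequency: {frequency(rarest):.7f}")
print(f"relative noise, rarest vs commonest: {se_ratio:.1f}x")
```

Under that equal-spread assumption, the mean for the rarest stratum is dozens of times noisier than the mean for the most common one, which is why pooling information across strata is attractive.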

We know that four balls result in a walk and three strikes result in a strikeout.

What if, instead, we first estimate the runs through the half-inning for a given game state and, within that game state, the runs through the half-inning when either a walk or strikeout occurs? We have more information on walks and strikeouts given a game state:

Game State Walk Strikeout Observations Frequency
—:0 0 0 31962 0.1713624
—:0 0 1 10517 0.0563863
—:0 1 0 3518 0.0188616
—:1 0 0 22710 0.1217583
—:1 0 1 8183 0.0438727
—:1 1 0 2677 0.0143526
—:2 0 0 17773 0.0952889
—:2 0 1 6611 0.0354445
—:2 1 0 2267 0.0121544
–3:0 0 0 229 0.0012278
–3:0 0 1 57 0.0003056
–3:0 1 0 25 0.0001340
–3:1 0 0 940 0.0050398
–3:1 0 1 238 0.0012760
–3:1 1 0 128 0.0006863
–3:2 0 0 1438 0.0077098
–3:2 0 1 541 0.0029005
–3:2 1 0 328 0.0017586
-2-:0 0 0 1958 0.0104977
-2-:0 0 1 506 0.0027129

And if we assume that the accumulation of balls and strikes affects the probability of a walk or strikeout similarly across game states, then we have far more observations for each count:

Balls Strikes Observations Frequency
0 0 131011 0.3370864
0 1 49016 0.1261163
0 2 23829 0.0613111
1 0 42252 0.1087128
1 1 34713 0.0893153
1 2 30335 0.0780508
2 0 14409 0.0370738
2 1 16355 0.0420808
2 2 21770 0.0560134
3 0 7448 0.0191634
3 1 7100 0.0182680
3 2 10419 0.0268077

We can jointly estimate the run value of a walk or strikeout, discounted by the probability of accumulating the balls or strikes needed to reach that result. We can model these as arising from an underlying random process. As these are count data, modeling our unknowns with Poisson distributions may be a helpful place to start. Here, I treat the counts as over-dispersed and use a negative binomial, which allows the variance to differ from the mean:

\[ \begin{equation} \begin{split} \text{NegBin}(\text{runs}_{\text{game state}} &\mid \boldsymbol\mu_{\text{game state}}, \phi_{\text{game state}}) \\ \\ \text{NegBin}(\text{runs}_{\text{game state}, \text{outcomes}} &\mid \boldsymbol\mu_{\text{game state}, \text{outcomes}}, \phi_{\text{game state}}) \\ \end{split} \end{equation} \]

and model \(\mu_{\text{game state}, \text{outcomes}}\) as depending on \(\mu_{\text{game state}}\),

\[ \begin{equation} \begin{split} \text{exponential}(\boldsymbol\mu_{\text{game state}, \text{outcomes}} &\mid \frac{1}{\boldsymbol\mu_{\text{game state}} \cdot \boldsymbol\lambda_{\text{outcomes}}}) \\ \\ \text{exponential}(\boldsymbol\mu_{\text{game state}} &\mid 1) \\ \\ \text{Normal}(\boldsymbol\lambda_{\text{outcomes}} &\mid 1, 0.1) \\ \end{split} \end{equation} \]

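where \(\boldsymbol\lambda\) is a vector with one element per outcome. Posterior expectations of the fitted parameters \(\mu_{\text{game state}}\) (the Marginal column) and \(\mu_{\text{game state}, \text{outcomes}}\) (the remaining columns) are: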

Gamestate Marginal Walk Strikeout None Single Double Triple Homerun
No outs
—:0 0.52 0.82 0.26 0.31 0.9 1.14 1.45 1.53
1–:0 0.91 1.43 0.5 0.53 1.53 2.03 2.44 2.57
-2-:0 1.16 1.64 0.7 0.88 1.79 2.21 2.33 2.53
12-:0 1.49 2.12 0.99 1 2.44 3.04 3.45 3.59
–3:0 1.42 1.7 1.1 1.18 1.83 2.29 2.08 2.52
1-3:0 1.8 2.45 1.15 1.47 2.55 3.2 3.14 3.62
-23:0 1.99 2.44 1.43 1.67 2.91 3.16 2.96 3.68
123:0 2.28 2.98 1.59 1.8 3.23 3.96 4.34 4.27
One out
—:1 0.28 0.48 0.1 0.13 0.55 0.72 0.99 1.27
1–:1 0.54 0.91 0.21 0.23 1.02 1.53 1.98 2.28
-2-:1 0.7 1.07 0.35 0.37 1.37 1.78 2.04 2.3
12-:1 0.96 1.51 0.45 0.47 1.83 2.56 2.72 3.32
–3:1 0.99 1.53 0.42 0.77 1.52 1.72 1.85 2.31
1-3:1 1.22 1.87 0.57 0.84 2.08 2.48 3.04 3.36
-23:1 1.41 2.01 0.67 1.12 2.34 2.68 3.2 3.29
123:1 1.57 2.5 0.76 1 2.81 3.52 3.93 4.19
Two outs
—:2 0.11 0.22 0 0.01 0.23 0.34 0.39 1.11
1–:2 0.23 0.47 0 0.02 0.47 0.96 1.48 2.11
-2-:2 0.33 0.61 0 0.03 1.04 1.33 1.35 2.11
12-:2 0.45 0.91 0 0.04 1.25 1.95 2.29 3.09
–3:2 0.36 0.71 0 0.03 1.24 1.33 1.48 2.11
1-3:2 0.51 1 0 0.05 1.49 1.92 2.29 3.1
-23:2 0.57 1.05 0 0.06 2.05 2.34 2.18 2.99
123:2 0.78 1.65 0 0.08 2.27 3.06 3.34 4.05

As the marginal considers all events, not just walks and strikeouts, its values conditional on game state differ somewhat from the values for takes in a 0-0 count. Of note, the conditional runs above also assume that the batter’s ability to reach base (e.g., hitting a double) is independent of the base runners, which of course isn’t true but is a useful simplification for this initial model.
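The negative binomial likelihood used above allows the variance of runs to exceed the mean. The sketch below is a minimal simulation of that property, assuming the mean and dispersion parameterization of Stan's `neg_binomial_2`, where the variance is \(\mu + \mu^2 / \phi\); the text does not state which parameterization the model uses. The value of `PHI` is hypothetical because the fitted dispersion is not reported, and the sampler is a hand-written stand-in rather than the model's code.

```python
import math
import random
import statistics


def sample_neg_binomial_2(mu, phi, rng):
    """Draw one count with mean mu and variance mu + mu**2 / phi."""
    # Gamma-Poisson mixture: a Poisson whose rate is Gamma(shape=phi, scale=mu/phi).
    rate = rng.gammavariate(phi, mu / phi)
    # Knuth's multiplication method for the Poisson draw (adequate for small rates).
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


MU = 0.54   # expected runs with bases empty, no outs, 0-0 count (first table)
PHI = 1.0   # hypothetical dispersion; the fitted value is not given in the text

rng = random.Random(2020)
draws = [sample_neg_binomial_2(MU, PHI, rng) for _ in range(100_000)]

sample_mean = statistics.fmean(draws)
sample_var = statistics.pvariance(draws)
print(f"sample mean {sample_mean:.2f}, sample variance {sample_var:.2f}")
print(f"theoretical variance {MU + MU**2 / PHI:.2f}; a Poisson would have {MU:.2f}")
```

A smaller `PHI` gives heavier over-dispersion, while a very large `PHI` approaches the Poisson case where the variance equals the mean.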

Next we model the effects of balls and strikes on the probability of a walk or strikeout. If the probability of a strike \(\pi\) were constant across counts, we could model the probability of a strikeout from a count of \(b\) balls and \(s\) strikes by summing binomial terms over the number of balls \(k\) thrown before the final strike,

\[ \begin{equation} \begin{split} \pi_{\text{strikeout}}(\pi, b, s) = \sum_{k = 0}^{3 - b} \binom{k + 2 - s}{2 - s} (1 - \pi)^{k} \pi^{2 - s} \cdot \pi \\ \end{split} \end{equation} \]

with which we could model the remaining runs in that half-inning for counts with a take as,

\[ \begin{equation} \begin{split} &\text{NegBin}(\text{runs}_{\text{takes}} \mid \boldsymbol\mu, \phi) \\ \\ \end{split} \end{equation} \] where

\[ \begin{equation} \begin{split} &\boldsymbol\mu = \pi_{\text{strikeout}}(\pi, b, s) \cdot \mu_{\text{game state, strikeout}} + (1 - \pi_{\text{strikeout}}(\pi, b, s)) \cdot \mu_{\text{game state, walk}} \\ \end{split} \end{equation} \]

but we know from game strategy that \(\pi\) changes across counts. With a 12-vector of strike probabilities \(\boldsymbol\pi\), I expand the binomial recursively through functions \(f\) and \(g\),

\[ \begin{equation} \begin{split} &\boldsymbol\mu = f(\boldsymbol\pi, b, s, 0) \cdot \mu_{\text{game state, strikeout}} + (1 - f(\boldsymbol\pi, b, s, 0)) \cdot \mu_{\text{game state, walk}} \\ \end{split} \end{equation} \] Recursive function \(f\) is

\[ \begin{equation} \begin{split} f(\boldsymbol\pi, b, s, b_0) = \begin{cases} g(\boldsymbol\pi, b, s, b_0),& \text{if } b_0{ = 3 - b} \\\\ g(\boldsymbol\pi, b, s, b_0) + f(\boldsymbol\pi, b, s, b_0 + 1),& \text{otherwise} \end{cases} \end{split} \end{equation} \]

and in turn function \(g\) is,

\[ \begin{equation} \begin{split} g(\boldsymbol\pi, b, s, b_0) = \begin{cases} \displaystyle \prod_{S = s}^{2}\pi_{[b,S]},& \text{if } b_0 { = 0} \\ \\ \displaystyle \prod_{B = b}^{b + b_0} (1 - \pi_{[B,s]}) \cdot \pi_{[b_0, 2]},& \text{if } s = 2 \\\\ \displaystyle \sum_{\text{combns}}\left[ \displaystyle \prod_{\dot{b}, \dot{s}} (1 - \pi_{[\dot{b},\dot{s}]}) \cdot \pi_{[\dot{b},\dot{s}]} \right] \cdot \pi_{[\dot{b},2]} ,& \text{if } s = 1 \\\\ \displaystyle \sum_{\text{combns}}\left[ \displaystyle \prod_{\dot{b}, \dot{s}} (1 - \pi_{[\dot{b},\dot{s}]}) \cdot \pi_{[\dot{b},\dot{s}]} \cdot \pi_{[\dot{b},\dot{s}]} \right] \cdot \pi_{[\dot{b},2]} ,& \text{if } s = 0 \\\\ \end{cases} \end{split} \end{equation} \]

where \(\sum_{\text{combns}} \left[ \cdot \right]\) represents the sum over the \(\binom{3 - b + 2 - s}{2 - s}\) ball-strike combinations (see footnote 1). For priors, I roughly assume that a strike on any given pitch is a coin flip (e.g., a pitch on the edge of the called-strike zone), which I represent as \(\text{Normal}(0.5, \sigma)\). In the posterior we see how \(\boldsymbol\pi\) varies across counts:

Prior and posterior by count (the first prior/posterior pair is p(strike), the second is p(strikeout))
balls strikes prior posterior prior posterior
0 0 0.5 0.58 0.66 0.51
0 1 0.5 0.41 0.81 0.57
0 2 0.5 0.17 0.94 0.68
1 0 0.5 0.60 0.50 0.41
1 1 0.5 0.50 0.69 0.50
1 2 0.5 0.23 0.88 0.62
2 0 0.5 0.69 0.31 0.29
2 1 0.5 0.65 0.50 0.38
2 2 0.5 0.35 0.75 0.50
3 0 0.5 0.65 0.12 0.10
3 1 0.5 0.66 0.25 0.15
3 2 0.5 0.23 0.50 0.23

And this makes sense. In a 3-ball count, for example, batters are less likely to swing, so the pitcher should throw a strike more often. As strikes increase, the risks to the batter shift, and pitchers throw more balls to tempt the batter into swinging outside the zone (and missing).
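The p(strikeout) columns in the table above can be approximately recovered from the p(strike) columns. The sketch below is not the author's functions \(f\) and \(g\); it substitutes a pitch-by-pitch recursion over counts, assuming every take is either a ball or a strike and that the sequence continues until a walk or strikeout. The dictionaries copy the rounded table values, so results may differ from the table in the second decimal place.

```python
# Posterior p(strike) by (balls, strikes), copied from the table above (rounded).
POSTERIOR_STRIKE = {
    (0, 0): 0.58, (0, 1): 0.41, (0, 2): 0.17,
    (1, 0): 0.60, (1, 1): 0.50, (1, 2): 0.23,
    (2, 0): 0.69, (2, 1): 0.65, (2, 2): 0.35,
    (3, 0): 0.65, (3, 1): 0.66, (3, 2): 0.23,
}
# Prior: a coin flip at every count.
PRIOR_STRIKE = {count: 0.5 for count in POSTERIOR_STRIKE}


def strikeout_prob(strike_prob, balls, strikes):
    """Probability of reaching three strikes before four balls."""
    if strikes == 3:
        return 1.0   # strikeout reached
    if balls == 4:
        return 0.0   # walk reached
    p = strike_prob[(balls, strikes)]
    return (p * strikeout_prob(strike_prob, balls, strikes + 1)
            + (1 - p) * strikeout_prob(strike_prob, balls + 1, strikes))


for balls, strikes in sorted(POSTERIOR_STRIKE):
    prior = strikeout_prob(PRIOR_STRIKE, balls, strikes)
    posterior = strikeout_prob(POSTERIOR_STRIKE, balls, strikes)
    print(f"{balls}-{strikes}: prior {prior:.2f}, posterior {posterior:.2f}")
```

Because the recursion visits every ball-strike path that ends in a strike, it should agree with expanding all combinations explicitly, while being shorter to write.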

The expected runs in the table shown at the start are the means of the distributions of possible outcomes.
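As a worked check, the mixture of strikeout and walk values can be evaluated from the rounded numbers in the tables above. The sketch below takes the walk (0.82) and strikeout (0.26) values for bases empty with no outs, weights them by the posterior p(strikeout) at a few counts, and compares the result with the first table. The function name is hypothetical, and small differences are expected because the inputs are rounded.

```python
def expected_runs_take(p_strikeout, mu_strikeout, mu_walk):
    """Mixture mean: strikeout value and walk value weighted by p(strikeout)."""
    return p_strikeout * mu_strikeout + (1 - p_strikeout) * mu_walk


# Bases empty, no outs: conditional runs from the fitted-parameter table.
MU_WALK, MU_STRIKEOUT = 0.82, 0.26

# Posterior p(strikeout) by count, from the prior/posterior table.
P_STRIKEOUT = {"0-0": 0.51, "0-2": 0.68, "3-0": 0.10, "3-2": 0.23}

# Expected runs for the same game state and counts, from the first table.
FIRST_TABLE = {"0-0": 0.54, "0-2": 0.44, "3-0": 0.77, "3-2": 0.69}

for count, p in P_STRIKEOUT.items():
    value = expected_runs_take(p, MU_STRIKEOUT, MU_WALK)
    print(f"{count}: computed {value:.2f}, table {FIRST_TABLE[count]:.2f}")
```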

To calculate the expected change in runs from a called ball or strike, we subtract the expected runs under the current conditions from the expected runs after the call. For a non-final ball or strike, the latter is the same game state with one more ball or strike; for a final ball or strike, it is the value of a walk or strikeout:

\[ \begin{equation} \begin{split} \mathbb{E}(\Delta\text{runs}_{\text{take}}) = \begin{cases} \boldsymbol\mu_{\text{after}} - \boldsymbol\mu_{\text{before}},& \text{for non-final counts} \\ \\ \boldsymbol\mu_{\text{game state, strikeout}} - \boldsymbol\mu_{\text{before}},& \text{for strikeouts} \\ \\ \boldsymbol\mu_{\text{game state, walk}} - \boldsymbol\mu_{\text{before}},& \text{for walks} \\ \\ \end{cases} \end{split} \end{equation} \]
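The cases above can be sketched with the rounded values for bases empty and no outs: the expected runs by count from the first table, and the walk and strikeout values from the fitted-parameter table. The function and constant names are hypothetical, and the sketch covers only this one game state.

```python
# Expected runs by (balls, strikes) for bases empty, no outs (first table).
EXPECTED_RUNS = {
    (0, 0): 0.54, (0, 1): 0.50, (0, 2): 0.44,
    (1, 0): 0.59, (1, 1): 0.54, (1, 2): 0.47,
    (2, 0): 0.66, (2, 1): 0.61, (2, 2): 0.54,
    (3, 0): 0.77, (3, 1): 0.74, (3, 2): 0.69,
}
MU_WALK, MU_STRIKEOUT = 0.82, 0.26   # same game state, fitted-parameter table


def delta_runs(balls, strikes, call):
    """Expected change in runs from a called 'ball' or 'strike' on a take."""
    before = EXPECTED_RUNS[(balls, strikes)]
    if call == "strike":
        # A third strike ends the matchup in a strikeout.
        after = MU_STRIKEOUT if strikes == 2 else EXPECTED_RUNS[(balls, strikes + 1)]
    elif call == "ball":
        # A fourth ball ends the matchup in a walk.
        after = MU_WALK if balls == 3 else EXPECTED_RUNS[(balls + 1, strikes)]
    else:
        raise ValueError("call must be 'ball' or 'strike'")
    return after - before


for count in [(0, 0), (3, 2)]:
    for call in ("ball", "strike"):
        print(f"{count[0]}-{count[1]} called {call}: {delta_runs(*count, call):+.2f}")
```

With these inputs, the call on a full count moves expected runs far more than the call on the first pitch, since it decides between a walk and a strikeout.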

Total expected runs are then the sum of the probability of a take times the expected runs of a take, as above, and the expected runs of every other possible outcome of a pitch, each weighted by its probability.


  1. The binomial PMF does not work here because we expect the probability of throwing a strike to differ across counts. Thus, we have to expand all combinations, each ending in a strike.

Citation

For attribution, please cite this work as

Spencer (2020, June 11). Estimating utility of called pitches. Retrieved from https://ssp3nc3r.github.io/publications/pitch-utility.html

BibTeX citation

@misc{spencer2020estimating,
  author = {Spencer, Scott},
  title = {Estimating utility of called pitches},
  url = {https://ssp3nc3r.github.io/publications/pitch-utility.html},
  year = {2020}
}