Let’s practice expressing a probabilistic model aided by physics. Physics has long been used to describe mathematically the forces involved in running short distances. To estimate sprint speed, we can inform our models with basic anatomical information and physics. The physics of running speed depends on body mass, stride length, and stride rate. Let’s begin with those.

Let’s start with the physics equations that describe speed and acceleration to maximum speed. We can express this by considering acceleration per stride, as explained in Herman (2016). Imagine stepping into the starting blocks for a 100 m sprint. On the first push with our legs, a force \(F\) applied over a stride of length \(L\) accelerates us to a speed \(v_1\) at the end of the stride. To accelerate, we must produce the required work \(W\), force \(F\) times stride length \(L\), so \(W = FL\). That work becomes kinetic energy, \(\frac{1}{2}mv^2\) in general. Here it is \(\frac{1}{2}(m_{\text{b}} - 2m_{\text{leg}})v_1^2\) for the body without the legs, plus \(\frac{1}{2}m_{\text{leg}}v_1^2\) for the one leg in the air; the pushing leg, planted on the ground, is treated as not moving. After the first stride, then, we can express our speed as,

\[ v_1^2 = \frac{2FL}{m_{\text{b}}- m_{\text{leg}}} \]

In the next stride, the other foot does the work, again \(FL\), but now it accelerates the body from \(v_1\) to \(v_2\) and the first leg from \(0\) to \(v_2\), thus,

\[ v_2^2 = \frac{2FL}{m_{\text{b}} - m_{\text{leg}}} \left( 1 + \frac{m_{\text{b}} - 2m_{\text{leg}}}{m_{\text{b}} - m_{\text{leg}}} \right) \]
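
The step from \(v_1\) to \(v_2\) can be spelled out as an energy balance. This is a sketch of the reasoning rather than a quotation from Herman (2016), and it assumes, as the first-stride equation implies, that the planted leg is at rest while it pushes. The work \(FL\) then raises the body without its legs from \(v_1\) to \(v_2\) and brings the previously planted leg from rest to \(v_2\):

\[ \begin{aligned} FL &= \tfrac{1}{2}(m_{\text{b}} - 2m_{\text{leg}})(v_2^2 - v_1^2) + \tfrac{1}{2}m_{\text{leg}}v_2^2 \\ (m_{\text{b}} - m_{\text{leg}})\,v_2^2 &= 2FL + (m_{\text{b}} - 2m_{\text{leg}})\,v_1^2 \\ v_2^2 &= \frac{2FL}{m_{\text{b}} - m_{\text{leg}}} + \frac{m_{\text{b}} - 2m_{\text{leg}}}{m_{\text{b}} - m_{\text{leg}}}\,v_1^2 \end{aligned} \]

Substituting \(v_1^2 = 2FL / (m_{\text{b}} - m_{\text{leg}})\) into the last line recovers the expression for \(v_2^2\) above. Each further stride repeats the same recursion.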

For \(n\) strides,

\[ v_n^2 = \frac{2FL}{m_{\text{b}} - m_{\text{leg}}} \left( 1 + \frac{m_{\text{b}} - 2m_{\text{leg}}}{m_{\text{b}} - m_{\text{leg}}} + \left( \frac{m_{\text{b}} - 2m_{\text{leg}}}{m_{\text{b}} - m_{\text{leg}}} \right)^2 +\dots+\left( \frac{m_{\text{b}} - 2m_{\text{leg}}}{m_{\text{b}} - m_{\text{leg}}} \right)^{n-1}\right) \]

Notice that these terms follow the pattern of a geometric series, \(1 + x + x^2 + \dots + x^{n-1} = (1 - x^n) / (1 - x)\), which converges to \(1 / (1 - x)\) for \(0 < x < 1\). Here \(x = (m_{\text{b}} - 2m_{\text{leg}}) / (m_{\text{b}} - m_{\text{leg}})\), so \(1 / (1 - x) = (m_{\text{b}} - m_{\text{leg}}) / m_{\text{leg}}\). Thus, the speed approaches a final running velocity of,

\[ v_{n \rightarrow \infty} = \sqrt{\frac{2FL}{m_{\text{leg}}}} \]
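
To see how quickly the per-stride speeds approach that limit, here is a minimal Python sketch of the recursion. The runner's body mass (80 kg), height (1.80 m), and force (250 N) are illustrative assumptions, not measurements, and the function names are made up for this example. The leg-mass fraction 0.161 and stride ratio 1.35 are assumed here as round empirical values.

```python
import math


def stride_speeds(force, stride_length, body_mass, leg_mass, n_strides):
    """Speed (m/s) at the end of each stride, from the per-stride recursion."""
    moving_mass = body_mass - leg_mass  # everything except the planted leg
    x = (body_mass - 2 * leg_mass) / moving_mass  # ratio of the geometric series
    v_sq = 0.0
    speeds = []
    for _ in range(n_strides):
        # each stride adds work F * L and carries over a fraction x of v^2
        v_sq = 2 * force * stride_length / moving_mass + x * v_sq
        speeds.append(math.sqrt(v_sq))
    return speeds


def limiting_speed(force, stride_length, leg_mass):
    """Speed (m/s) the recursion converges to as the stride count grows."""
    return math.sqrt(2 * force * stride_length / leg_mass)


# Illustrative values only (assumptions, not data)
body_mass = 80.0                  # kg
height = 1.80                     # m
force = 250.0                     # N
leg_mass = 0.161 * body_mass      # kg
stride_length = 1.35 * height     # m

speeds = stride_speeds(force, stride_length, body_mass, leg_mass, 60)
v_limit = limiting_speed(force, stride_length, leg_mass)

for n in (1, 2, 5, 10, 20, 60):
    print(f"stride {n:2d}: {speeds[n - 1]:5.2f} m/s")
print(f"limit    : {v_limit:5.2f} m/s")
```

With these values the speeds rise monotonically and are within a small fraction of a meter per second of the limit after a few dozen strides, which is why the model can reasonably work with the limiting speed alone.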

Empirical studies provide an expected leg mass \(m_{\text{leg}}\) of around \(0.161m_{\text{b}}\); see Herman (2016). As with leg mass, empirical studies have estimated stride length as a function of height, though stride length also depends on running technique and the forces generated. Those experiments measured stride lengths over short distances of roughly 1.14 to 1.35 times the runner’s height; see Rompottie (1972) and Hoffman (1972). To predict the speed of a new run, we can first estimate a runner’s force by modeling observations of previous sprints.
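
Before building a model, it can help to invert the limiting-speed formula by hand, \(F = v^2 m_{\text{leg}} / (2L)\), to get a feel for the size of the forces involved. The Python sketch below does that for a hypothetical runner (1.80 m, 80 kg, 100 m in 10 s). It assumes the average race speed stands in for the limiting speed, which is a simplification, and the function name is invented for this example.

```python
def implied_force(distance, time, height, body_mass,
                  stride_ratio=1.35, leg_fraction=0.161):
    """Force (N) at which the limiting speed equals the average race speed."""
    speed = distance / time                  # average speed, m/s
    stride_length = stride_ratio * height    # m
    leg_mass = leg_fraction * body_mass      # kg
    return speed ** 2 * leg_mass / (2 * stride_length)


# Hypothetical runner; both ends of the reported stride-length range
forces = {
    ratio: implied_force(100.0, 10.0, 1.80, 80.0, stride_ratio=ratio)
    for ratio in (1.14, 1.35)
}
for ratio, f in forces.items():
    print(f"stride ratio {ratio}: about {f:.0f} N")
```

The shorter stride ratio implies a noticeably larger force for the same speed, so the choice of stride length matters for any force estimate.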

Let’s code the beginning of a simplified model that reflects these measures and assumptions using Stan, a probabilistic programming language. Take some time to compare the mathematical description above with the code below:

data {
  // runner and event information
  int<lower=0> N;          // count of runs
  array[N] int<lower=1> r; // runner index, 1..n_r
  vector<lower=0>[N] t;    // time, s
  vector<lower=0>[N] d;    // distance, m
  vector<lower=0>[N] m;    // body mass, kg
  vector<lower=0>[N] h;    // body height, m
}

transformed data {
  int n_r = max(r);                     // count unique runners
  vector<lower=0>[N] L = 1.35 * h;      // stride length, m (upper end of the 1.14 to 1.35 range)
  vector<lower=0>[N] m_leg = 0.161 * m; // mass of one leg, kg
}

parameters {
  vector<lower=0>[n_r] F; // force each runner, Newtons (kg m / s^2)
  real<lower=0> F_pop;    // force population, Newtons (kg m / s^2)
  real<lower=0> sigma;    // overall variation, m / s
}

model {

  // priors
  F_pop ~ normal(250, 20);
  F ~ normal(F_pop, 50);
  sigma ~ exponential(5);

  // likelihood: average speed d / t is modeled around the limiting speed
  vector[N] mu = sqrt(2 * F[r] .* L ./ m_leg);
  target += normal_lpdf( d ./ t | mu, sigma);
}

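generated quantities {
  // replicated average speeds (m / s) and the race times (s) they imply
  array[N] real v_hat = normal_rng(sqrt(2 * F[r] .* L ./ m_leg), sigma);
  vector[N] t_hat = d ./ to_vector(v_hat);
}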
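
One quick way to see what the priors in this model imply is to simulate from them before looking at any data. The Python sketch below is an approximation of the Stan model's prior, not a replacement for it: it mimics the `lower=0` constraints with simple rejection, and it uses a hypothetical runner of 1.80 m and 80 kg. The function names are invented for this example.

```python
import math
import random
import statistics


def draw_positive_normal(rng, mean, sd):
    """Draw from a normal distribution, rejecting non-positive values."""
    while True:
        x = rng.gauss(mean, sd)
        if x > 0:
            return x


def prior_predictive_speeds(height, body_mass, n_draws=4000, seed=1):
    """Simulate average speeds (m/s) implied by the priors for one runner."""
    rng = random.Random(seed)
    stride_length = 1.35 * height   # m, as in transformed data
    leg_mass = 0.161 * body_mass    # kg, as in transformed data
    speeds = []
    for _ in range(n_draws):
        f_pop = draw_positive_normal(rng, 250, 20)  # F_pop ~ normal(250, 20)
        f = draw_positive_normal(rng, f_pop, 50)    # F ~ normal(F_pop, 50)
        sigma = rng.expovariate(5)                  # rate 5, so mean 0.2 m/s
        mu = math.sqrt(2 * f * stride_length / leg_mass)
        speeds.append(rng.gauss(mu, sigma))
    return speeds


# Hypothetical runner (assumed values)
speeds = prior_predictive_speeds(height=1.80, body_mass=80.0)
cuts = statistics.quantiles(speeds, n=20)  # cut points at 5%, 10%, ..., 95%
print(f"prior predictive speed, 5% to 95%: {cuts[0]:.1f} to {cuts[-1]:.1f} m/s")
print(f"median: {statistics.median(speeds):.1f} m/s")
```

If the simulated speeds sit far from plausible sprint speeds, that is a hint the priors deserve another look before fitting.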

Let’s fit this simplistic model to, say, the men’s Olympic 100 m winners. We can pull their race information from Wikipedia, along with each winner’s height and weight from their biographies, with the caveat that those anthropometric values are general and do not necessarily reflect their height and weight at the time of the winning race. Thus, this model fit is merely illustrative. Here are the data:

Year	Winner	Time (s)	Height (m)	Weight (kg)
1896	Tom Burke (USA)	12	1.83	66
1900	Frank Jarvis (USA)	11	1.67	58
1904	Archie Hahn (USA)	11	1.67	64
1908	Reggie Walker (SAF)	10.8	1.70	61
1912	Ralph Craig (USA)	10.8	1.82	73
1920	Charles Paddock (USA)	10.8	1.71	75
1924	Harold Abrahams (GBR)	10.6	1.83	75
1928	Percy Williams (CAN)	10.8	1.70	56
1932	Eddie Tolan (USA)	10.38	1.70	65
1936	Jesse Owens (USA)	10.3	1.80	75
1948	Harrison Dillard (USA)	10.3	1.78	69
1952	Lindy Remigino (USA)	10.4	1.68	63
1956	Bobby Morrow (USA)	10.5	1.86	75
1960	Armin Hary (GER)	10.2	1.82	71
1964	Bob Hayes (USA)	10	1.80	84
1968	Jim Hines (USA)	9.95	1.83	81
1972	Valeriy Borzov (URS)	10.14	1.83	80
1976	Hasely Crawford (TRI)	10.06	1.87	90
1980	Allan Wells (GBR)	10.25	1.83	86
1984	Carl Lewis (USA)	9.99	1.88	80
1988	Carl Lewis (USA)	9.92	1.88	80
1992	Linford Christie (GBR)	9.96	1.88	92
1996	Donovan Bailey (CAN)	9.84	1.85	91
2000	Maurice Greene (USA)	9.87	1.76	75
2004	Justin Gatlin (USA)	9.85	1.85	83
2008	Usain Bolt (JAM)	9.69	1.95	94
2012	Usain Bolt (JAM)	9.63	1.95	94
2016	Usain Bolt (JAM)	9.81	1.95	94
2021	Lamont Marcell Jacobs (ITA)	9.8	1.86	84

Let’s fit the model with these data.
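
As a sketch of the data preparation, the Python below turns the table into the inputs the Stan `data` block declares (`N`, `r`, `t`, `d`, `m`, `h`). The helper name `build_stan_data` is made up for this example, and the assumption that every race is 100 m comes from the event itself. Repeat winners share one runner index, so their races inform a single force parameter.

```python
import json

# (year, winner, time s, height m, weight kg), copied from the table above
results = [
    (1896, "Tom Burke", 12.0, 1.83, 66),
    (1900, "Frank Jarvis", 11.0, 1.67, 58),
    (1904, "Archie Hahn", 11.0, 1.67, 64),
    (1908, "Reggie Walker", 10.8, 1.70, 61),
    (1912, "Ralph Craig", 10.8, 1.82, 73),
    (1920, "Charles Paddock", 10.8, 1.71, 75),
    (1924, "Harold Abrahams", 10.6, 1.83, 75),
    (1928, "Percy Williams", 10.8, 1.70, 56),
    (1932, "Eddie Tolan", 10.38, 1.70, 65),
    (1936, "Jesse Owens", 10.3, 1.80, 75),
    (1948, "Harrison Dillard", 10.3, 1.78, 69),
    (1952, "Lindy Remigino", 10.4, 1.68, 63),
    (1956, "Bobby Morrow", 10.5, 1.86, 75),
    (1960, "Armin Hary", 10.2, 1.82, 71),
    (1964, "Bob Hayes", 10.0, 1.80, 84),
    (1968, "Jim Hines", 9.95, 1.83, 81),
    (1972, "Valeriy Borzov", 10.14, 1.83, 80),
    (1976, "Hasely Crawford", 10.06, 1.87, 90),
    (1980, "Allan Wells", 10.25, 1.83, 86),
    (1984, "Carl Lewis", 9.99, 1.88, 80),
    (1988, "Carl Lewis", 9.92, 1.88, 80),
    (1992, "Linford Christie", 9.96, 1.88, 92),
    (1996, "Donovan Bailey", 9.84, 1.85, 91),
    (2000, "Maurice Greene", 9.87, 1.76, 75),
    (2004, "Justin Gatlin", 9.85, 1.85, 83),
    (2008, "Usain Bolt", 9.69, 1.95, 94),
    (2012, "Usain Bolt", 9.63, 1.95, 94),
    (2016, "Usain Bolt", 9.81, 1.95, 94),
    (2021, "Lamont Marcell Jacobs", 9.8, 1.86, 84),
]


def build_stan_data(rows, distance=100.0):
    """Build the dictionary of inputs declared in the Stan data block."""
    runner_ids = {}
    for year, name, time_s, height_m, mass_kg in rows:
        # 1-based index, assigned in order of first appearance
        runner_ids.setdefault(name, len(runner_ids) + 1)
    return {
        "N": len(rows),
        "r": [runner_ids[row[1]] for row in rows],
        "t": [float(row[2]) for row in rows],
        "d": [distance] * len(rows),
        "m": [float(row[4]) for row in rows],
        "h": [float(row[3]) for row in rows],
    }


stan_data = build_stan_data(results)
stan_json = json.dumps(stan_data)  # JSON text a Stan interface can read
print(stan_data["N"], "runs by", max(stan_data["r"]), "distinct runners")
```

The resulting dictionary or JSON could then be handed to a Stan interface. With CmdStanPy, for example, that would look something like `CmdStanModel(stan_file=...).sample(data=stan_data)`, where the model file path is whatever you saved the program above as.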

Running a posterior predictive check, we find that our estimates include the data:

Here are our estimates of the forces each winner produced:

Now, in this simplified example, I fixed leg mass \(m_{\text{leg}}\) and stride length \(L\) at values from prior experiments, but these are not precise for our particular sprinters. We can propagate variation and uncertainty in them by estimating them as random variables, informed by past observations. Further, the priors were chosen for illustration rather than carefully informed by data. With more care in those priors, we can model these forces more accurately. Of note, we estimated each winner’s force as a separate parameter, informed by a hyper-prior on the average force generated across the population.

What can we use these parameters for? Well, we might draw from a particular winner’s estimates to simulate how a new race might have gone. Or we may use the population information to estimate the force a new winning sprinter might generate. But, certainly, there are limitations in using this model as is. If we were to extend it, a better use might be to model, say, the maximum forces we could expect in tomorrow’s Olympic races. To get started, consider some ideas in Noubary (2010). Or we may consider modeling how runners’ forces change as they age.

Stay curious.

References

Herman, Irving P. 2016. Physics of the Human Body. 2nd ed. Biological and Medical Physics, Biomedical Engineering. Switzerland: Springer International Publishing. https://doi.org/10.1007/978-3-319-23932-3.

Hoffman, K. 1972. “Stature, Leg Length, and Stride Frequency.” Track Technique, no. 46: 1463–69.

Noubary, Reza. 2010. “What Is the Speed Limit for Men’s 100 Meter Dash.” In Mathematics and Sports, edited by Joseph A. Gallian, 287–94. Dolciani Mathematical Expositions, no. 43. Washington, DC: Mathematical Association of America.

Rompottie, K. 1972. “A Study of Stride Length in Running.” International Track and Field, 249–56.

Modeling forces in 100m Olympic sprint, a study in physics and probability
