Mathematical Runner’s Prediction for the 2019 Boston Marathon Cutoff


There’s still time to enter our “Guess the 2019 Boston Marathon Registration Cutoff” Contest! Beat our prediction, win bragging rights. Be one of the top three entries, win actual, real prizes! Entries close Sept. 9.

Predicting the Boston Marathon qualifying cutoff times is a fool’s game. There’s too much uncertainty to predict the cutoff to the necessary precision when a single second can make the difference between qualifying and a nice try. But here at Mathematical Runner, we are nothing if not foolish, so here we go:

To predict the Boston Marathon cutoff time, we want to know:

  • How many qualified runners submit an application
  • How many qualified runners the BAA will admit
  • The distribution of the applicants’ qualifying times

We have historical data of varying quality for these, along with past cutoff times, which we’ll use to make our prediction. We’ll work with numbers from 2014 onward, after the BAA last changed the qualifying standards and after the events of 2013.

Here’s the data for applicants, acceptances, and cutoff times (via Runner’s World) which we’ll use to develop and validate our model:

 

The easiest thing to do is a linear projection of previous cutoff times. The result for 2019 using that method is 3:31. But that same method predicted a cutoff of 2:34 for 2018, which was off by almost a minute. We should be able to do better than that.
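For anyone who wants to check the straight-line projection, here is a minimal Python sketch. The cutoff values are the publicly reported ones, filled in as an assumption since the data table is not reproduced here, and the helper names (`linear_projection`, `as_min_sec`) are made up for illustration. It assumes the 2018 projection was fit on 2014 through 2017 and the 2019 projection on 2014 through 2018.

```python
# Cutoffs in seconds by race year. ASSUMPTION: publicly reported values,
# filled in here because the original data table is not available.
CUTOFFS = {2014: 98, 2015: 62, 2016: 148, 2017: 129, 2018: 203}


def linear_projection(points, target_x):
    """Fit y = a + b*x by ordinary least squares and evaluate at target_x."""
    xs = list(points.keys())
    ys = list(points.values())
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return mean_y + slope * (target_x - mean_x)


def as_min_sec(seconds):
    """Format a number of seconds as m:ss, rounded to the nearest second."""
    total = round(seconds)
    return f"{total // 60}:{total % 60:02d}"


# Project 2018 from the years before it, then 2019 from all five years.
history_before_2018 = {y: c for y, c in CUTOFFS.items() if y < 2018}
projected_2018 = linear_projection(history_before_2018, 2018)
projected_2019 = linear_projection(CUTOFFS, 2019)
print(as_min_sec(projected_2018), as_min_sec(projected_2019))
```

Under those assumptions the sketch lands on the same two numbers quoted above, which is a decent sign that the inputs are the right ones.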

The Law of Supply and Demand tells us that the cutoff time should go up as demand (the number of applicants) increases and down as supply (the number of runners accepted) increases. So to predict the 2019 cutoff, we’ll start by predicting those numbers.

To figure out the number of applicants for 2019, we start by estimating how many runners have run what I call an mBQ (maybe Boston Qualifier, a race faster than the qualifying times posted by the BAA) in the year prior to registration in September 2018. We break that total down into three main components:

  • Runners who qualified in 2017 marathons held after September’s registration for Boston 2018
  • Runners who qualified at the 2018 Boston Marathon
  • Runners who qualified in 2018 marathons other than Boston

We don’t have 100% accurate counts for any of those, but we do have the lists of the top thirty races with the most qualifiers that MarathonGuide.com publishes for each year: http://www.marathonguide.com/races/BostonMarathonQualifyingRaces.cfm

This doesn’t include every qualifier of course, but our assumption is that it includes enough of them to make any patterns visible. Don’t worry; we’ll be making a LOT more assumptions along the way to our predicted cutoff.

Using our historical data, let’s add the three components together:

 

When you compare the number of applications to the three components, you can see that they track together year by year. The large number of mBQs at Boston has an excessive impact on the sum. We adjust for that by multiplying the Boston mBQs by a fraction (0.34443, found mostly by trial and error).

One thing stands out in particular. While the sum has trended downward (in line with total marathon finishes over the past few years), applications for Boston are still trending upward. Some unknown factor is driving up Boston applications. We account for it by plotting the difference between the number of applications and the sum for each year, fitting a line to those points, and using the slope of that line to adjust our estimate.

When we do that, the trendlines for the number of apps and the estimate match up, and the total error drops under 2%. Not bad.
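One way to read the recipe above is sketched below in Python. This is an interpretation, not the exact spreadsheet: it assumes the trend adjustment means fitting a straight line to the yearly gap between applications and the weighted sum, then extending that line to the target year. The function names are hypothetical, and the numbers in the usage lines are placeholders, NOT real mBQ or application counts.

```python
BOSTON_WEIGHT = 0.34443  # the trial-and-error fraction quoted in the text


def weighted_sum(post_registration_mbqs, boston_mbqs, other_mbqs):
    """Add the three components, down-weighting the Boston mBQs."""
    return post_registration_mbqs + BOSTON_WEIGHT * boston_mbqs + other_mbqs


def fit_line(xs, ys):
    """Ordinary least squares; returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def estimate_applications(history, target_year, target_components):
    """history: list of (year, (post_reg, boston, other), actual_applications)."""
    years = [year for year, _, _ in history]
    # Gap between what was actually submitted and what the raw sum suggests.
    gaps = [apps - weighted_sum(*parts) for _, parts, apps in history]
    slope, intercept = fit_line(years, gaps)
    # ASSUMPTION: the gap keeps growing along the fitted line.
    return weighted_sum(*target_components) + intercept + slope * target_year


# Placeholder inputs for illustration only; these are not real counts.
toy_history = [
    (1, (100, 1000, 200), weighted_sum(100, 1000, 200) + 10),
    (2, (110, 900, 210), weighted_sum(110, 900, 210) + 20),
    (3, (90, 950, 190), weighted_sum(90, 950, 190) + 30),
]
toy_estimate = estimate_applications(toy_history, 4, (95, 980, 205))
print(round(toy_estimate, 2))
```

In the toy data the gap grows by exactly 10 per year, so the estimate for year 4 is the weighted sum plus 40.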

 

To predict the number of applications for 2019, we have everything we need except the number of runners who’ve qualified in 2018 marathons other than Boston. Looking back shows that lists from previous years had anywhere between 12 and 17 races from before registration, averaging about 14. So I added up the top 14 from the current list (excluding Boston, of course). One of the “last chance” races might still make the list, but the difference won’t be that much.

Anyhow, plug in all the numbers for 2019, and we end up with an estimate of 31401 applications.

We also need to know the number of qualifiers that the BAA will allow into the race. Mathematical Runner does not have any better connections at the BAA than you do, so we’ll have to guess. We might as well keep it simple. Using the same four years we used to project the number of applicants, Excel’s FORECAST function estimates the number of acceptances as 23032. That seems like it might be low, so we will arbitrarily change the estimate to 23200, more in line with the last two years.
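If you don’t have Excel handy, the same straight-line forecast is easy to sketch in Python. Excel’s FORECAST(x, known_y’s, known_x’s) is a least-squares line evaluated at x, and the `forecast` helper below mirrors that argument order. The acceptance counts are the publicly reported figures for the 2015 through 2018 races, filled in as an assumption about which four years were used.

```python
# Accepted qualifiers by race year. ASSUMPTION: publicly reported figures,
# and an assumption that these are the four years fed to FORECAST.
ACCEPTED = {2015: 23546, 2016: 24032, 2017: 23214, 2018: 23198}


def forecast(x, known_ys, known_xs):
    """Least-squares linear prediction, argument order as in Excel's FORECAST."""
    mean_x = sum(known_xs) / len(known_xs)
    mean_y = sum(known_ys) / len(known_ys)
    slope = sum(
        (kx - mean_x) * (ky - mean_y) for kx, ky in zip(known_xs, known_ys)
    ) / sum((kx - mean_x) ** 2 for kx in known_xs)
    return mean_y + slope * (x - mean_x)


accepted_2019 = forecast(2019, list(ACCEPTED.values()), list(ACCEPTED.keys()))
print(round(accepted_2019))
```

With those inputs the sketch reproduces the 23032 figure above. The bump to 23200 is still a judgment call that no formula will make for you.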

Now, how do we get from the number of applicants and the number of applications accepted to the cutoff time?

We start with the knowledge that marathon results are distributed in something like a bell-shaped curve. There are very few people who can run a 2:10 marathon, many more who can run a 4 hour race, and few who can (or choose to) run 6 hour marathons. The shape of the curve tends to be a little front-loaded (especially for men), with a long, slow tail.

 

Our assumption is that the distribution of mBQ times for runners who actually submit an application is the same as the distribution for all runners. Potential Boston Marathon qualifiers come from the faster end of the curve. Let’s take a closer look at that.

 

The area to the left of the mBQ line represents the total number of applicants with an mBQ, while the smaller area to the left of the cutoff line represents the number of accepted qualifiers. (Many people finish just under their mBQ time.)

A single runner can have more than one mBQ. We’ll assume we can ignore that. Anyhow, most of the runners with multiple marathon finishes come from the slower side of the curve.

If we knew the formula for the distribution curve, we could calculate the area under the curve by taking the integral of the curve from the fastest runners to the nominal qualifying times. But don’t worry, a few more assumptions will let us skip the calculus.

The first assumption we’ll make is that for the area in question, the shape of the curve can be usefully approximated by a straight line. That makes the area under the curve into a triangle, which makes the math much simpler.

 

Then geometry (I know, but at least it is easier than integral calculus) tells us that the area of the triangle representing the number of runners who submitted applications is ½*Q*H, where Q is the base along the time axis and H is the height, while the area of the triangle representing the number of runners accepted is ½*Q1*H1. Q1 is just Q minus the cutoff time, of course. And because the two triangles are similar, trigonometry (still easier than calc) tells us that Q/H=Q1/H1.

The second assumption is that Q, the side of the triangle along the horizontal axis representing the amount of time between the fastest runners and the qualifying standards, is one hour (60 minutes). We get there by noticing that the fastest runners are just about an hour faster than the open qualifying time, and assuming (there’s that word again) that the same gap applies for all genders and age groups. (That’s reasonably close to being true.) So Q1 is 60 minutes minus the cutoff.

Given all that, we can solve for the cutoff time in terms of the data we know, then do some curve-fitting to match the historical data. I’ll leave the details as an exercise for the student.
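For students who want a head start, here is a sketch of the algebra only. Dividing the two triangle areas and using Q/H=Q1/H1 gives accepted/applicants = (Q1/Q)², so the cutoff is Q minus Q times the square root of that ratio. The function name is hypothetical. The curve-fitting step is not described here, so this raw value is not the fitted prediction and should not be expected to match it.

```python
import math

Q_MINUTES = 60.0  # assumed gap between the fastest runners and the standard


def raw_cutoff_minutes(applicants, accepted, q=Q_MINUTES):
    """Cutoff implied by the triangle model alone, before any curve fitting.

    Areas: applicants = 1/2*Q*H and accepted = 1/2*Q1*H1, with Q/H = Q1/H1,
    so accepted/applicants = (Q1/Q)**2 and cutoff = Q - Q1.
    """
    if not 0 < accepted <= applicants:
        raise ValueError("need 0 < accepted <= applicants")
    q1 = q * math.sqrt(accepted / applicants)
    return q - q1


# The 2019 estimates from earlier in the post.
raw_2019 = raw_cutoff_minutes(applicants=31401, accepted=23200)
print(f"{raw_2019:.2f} minutes before curve fitting")
```

The un-fitted triangle overshoots every cutoff seen so far by a wide margin, which is exactly why the fit against the historical data is needed.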

Our process does a reasonable job of mapping to the actual cutoffs from previous years.

 

Plug in the numbers we’ve calculated for 2019, and Mathematical Runner predicts that the cutoff time for the 2019 Boston Marathon will be… 5:16. If that’s correct, and all our other estimates are also true, then 8204 applicants will miss the cut.

Both of these numbers are much higher than we’ve seen in the past. Two things stand out to us as factors driving the increases:

First, there was a significant increase in the number of mBQs during 2018, driving the cutoff estimate up. The number requalifying at Boston rose, and even though we didn’t include the last chance races in September, the number of mBQs at non-Boston races went up even more. The numbers may not be exact, but we’re pretty confident in that factor.

Second, as we noted earlier, while the raw number of mBQs has trended downward (in line with total marathon finishes over the past few years), applications for Boston were still trending upward, so we had to add in a curve-fitting factor to make the historical numbers work. That also drives the cutoff estimate up.

Obviously, that trend (fewer potential qualifiers, more Boston applications) can’t continue forever. If you project it forward far enough, you end up with more applications than there are qualifiers. That throws the validity of our curve-fitting factor into question.

Does the trend end this year? Who knows? But our gut tells us that fitting to only four data points using an obviously invalid method is not the way to go. We’re leaving out the final curve fitting when making our prediction.

So (drumroll, please)…

Mathematical Runner predicts that the cutoff time for the 2019 Boston Marathon will be: 4:03.

That’s still higher than ever before. We hope your mBQ is good enough to get you in!