Paul Grinvalds’ Boston Cutoff Prediction


Readers who submit an entry into our Guess the 2019 Boston Marathon Cutoff contest are invited to submit the reasoning behind their entry for publication. (Have you entered yet?)

 

Every September, when registration for the Boston Marathon opens up, a significant number of potential runners have to wait anxiously to see whether their mBQ (maybe Boston Qualifying) time will beat the cutoff and be good enough to get them in the race.

The cutoff is the amount of time by which a runner must beat their Boston qualifying standard to be accepted into the race. For example, if the cutoff time is 2:00 and your age group’s Boston qualifying time is 3:25, then you need a time of 3:23 or better to be accepted.
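The same arithmetic can be written as a short Python sketch. The function name `required_time` is made up for this illustration; the 3:25 standard and 2:00 cutoff are the figures from the example above.

```python
from datetime import timedelta

def required_time(qualifying_standard: timedelta, cutoff: timedelta) -> timedelta:
    """Slowest finishing time that still clears the cutoff."""
    return qualifying_standard - cutoff

standard = timedelta(hours=3, minutes=25)  # age group qualifying time of 3:25
cutoff = timedelta(minutes=2)              # cutoff of 2:00
needed = required_time(standard, cutoff)
print(needed)  # 3:23:00
```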

One of the ways people deal with the waiting is by developing estimates of the cutoff for the upcoming Boston Marathon. Some guesses are based purely on a hunch while others are based on statistical analysis. What all predictions for the past few years have in common is that they have understated, often significantly, the actual cutoff.

 

Cutoff Unpredictability

It seems that each year a new factor that no one predicted affects the cutoff. For example, in 2017 the BAA reduced the number of accepted time-qualifier entries by 818 individuals. The reduced number of accepted entries increased the cutoff time by approximately 36 seconds.

Predictions have been developed by comparing the number of qualifiers from the past year to the current year for some of the larger feeder races. One issue with this method is that newer, smaller races created to help people achieve Boston qualifying times (e.g. the BQ.2 races in Geneva and Grand Rapids) increased the total number of qualifiers but weren’t included in the sample. This likely understated the total number of qualifiers.

Findmymarathon.com developed an estimate for the cutoff for the 2018 marathon by looking at the total number of mBQs for 2018 compared to past years along with some other data. Using the total number of qualifiers rather than just a sample of the larger feeder races should provide a more accurate estimate. However, what they did not seem to take into account was the increase in the demand factor (the percentage of total qualifiers that register). That increased from 48.5% for 2017 to 55.4% for 2018, resulting in 3,543 more applicants than expected, which contributed to increasing the cutoff by approximately two minutes over their estimate.

 

Cutoff Prediction Methodology

There are four parameters that determine what the cutoff will be. These are:

  • Number of potential qualifiers (mBQs)
  • Demand percentage
  • Number of accepted applications
  • Runners per second in the cutoff range

We can get a good estimate of the first parameter from marathon aggregator sites such as Findmymarathon.com. They have not published a figure yet this year since the qualifying period has not ended. However, based on sampling larger feeder races, it appears that the number of mBQs will increase for the 2019 Boston Marathon as compared to 2018.

The demand percentage may be the most difficult parameter to estimate. The figure has ranged from 44.0% in 2015 to 55.4% in 2018. It appears this is on an increasing trend. Trying to determine why would be pure speculation. More data will be needed before we know if the large increase from 2017 to 2018 was a one-time surge or just a part of a generally increasing trend.

The number of applications accepted is not necessarily predictable, but that figure has been relatively stable in recent years. Accepted applications have ranged from 22,679 in 2014 to 24,032 in 2016. This can vary based on overall field size, but the BAA typically announces the field size prior to registration, providing some insight to help make predictions.

The fourth parameter is the runners per second in the cutoff range. This can be determined by dividing the number of runners cut by the cutoff time in seconds. For example, in 2015 1,947 runners were cut and the cutoff time was 1:02 (or 62 seconds). Dividing 1,947 by 62 gives us a result of 31.4 runners per second. As with the demand percentage, this is not a predictable factor and has varied considerably in recent years.
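As a quick check of the 2015 figure, here is a minimal Python sketch of the division. The function name `runners_per_second` is chosen for this illustration only.

```python
def runners_per_second(runners_cut: int, cutoff_seconds: int) -> float:
    """Average number of applicants per second of margin inside the cutoff range."""
    return runners_cut / cutoff_seconds

# 2015: 1,947 runners cut with a cutoff of 1:02 (62 seconds)
rate_2015 = runners_per_second(1947, 62)
print(round(rate_2015, 1))  # 31.4
```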

The table below shows the historical data for the parameters I will use to develop estimates:

 

* Findmymarathon.com estimates

Developing Predictions

When parameters are not predictable, a useful technique is to develop ranges rather than a single point estimate.

Number of Qualifiers

Based on a sampling of feeder races, my estimate is that the number of mBQs will increase by 11.6% from the prior year, which implies that there will be 56,916 total potential qualifiers. This estimate can be refined with more complete data when the qualification period ends. Although I suggest using ranges for the other parameters, I use a single figure for the number of mBQs because it will be known once the qualifying period ends.

 

Demand Percentage

As mentioned, this is unpredictable, so I will use a high and a low value to develop a range of estimates. For the high value I will use 58%, which is a slight increase from 55.4% last year. For the low value I will use 44%, which is the lowest figure in the past four years. Since there has been a generally increasing trend, it seems plausible that the figure this year could exceed the figure from last year, but it would be extremely unlikely to be lower than 44%.

 

Accepted Applications

I will assume a high of 24,000 and a low of 23,000, which assumes the total size of the field does not change.

Runners per second

I will use the highest and lowest figures in the past four years, so the high figure is 31.4 and the low figure is 22.9. Note that in this case the high figure of 31.4 produces a lower cutoff time than the 22.9 figure, so the lower figure is the conservative estimate.

The first step in determining the cutoff is to estimate the total number of applications. We get this by multiplying the number of mBQs by the demand percentage figures which gives us the following:

 

To get the number of runners cut, we subtract the accepted applications from total applications. Since we have two estimates of the number of applications and two estimates for the number of accepted applications, we will have four results for the number of runners cut. These are as follows:
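A minimal Python sketch of these first two steps, using only the inputs stated in the text (56,916 mBQs, demand of 58% or 44%, and 24,000 or 23,000 accepted applications). Rounding applications to whole runners and the variable names are assumptions of this sketch.

```python
MBQS = 56_916                                # estimated potential qualifiers
DEMAND = {"high": 0.58, "low": 0.44}         # share of mBQs who apply
ACCEPTED = {"high": 24_000, "low": 23_000}   # accepted applications

# Step 1: total applications = mBQs x demand percentage (rounded to whole runners)
applications = {name: round(MBQS * pct) for name, pct in DEMAND.items()}

# Step 2: runners cut = total applications - accepted applications
runners_cut = {
    (d, a): applications[d] - ACCEPTED[a]
    for d in DEMAND
    for a in ACCEPTED
}

for (d, a), cut in runners_cut.items():
    print(f"{d} demand, {a} accepted: {cut:,} runners cut")
```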

 

The last step is to take each of the number of runners cut and divide by runners per second to get the cutoff times. Those results are as follows:

 

Using this methodology we get cutoff times ranging from 33 seconds to 7:36.

What-If Scenarios

Since the range is so broad, it might not be too helpful for most people trying to determine their chances of getting accepted. What might be more helpful is to work backwards: see what assumptions are necessary for a given qualifying time to be accepted, and then judge whether those assumptions are reasonable.

For example, assume a runner has beaten their qualifying time by 5 minutes. What demand assumption is necessary for a mBQ -5:00 to be accepted if we make conservative assumptions for applications accepted (23,000) and runners per second (22.9)? In this case, a demand percentage of 52% results in a cutoff time of 5 minutes.
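Working backwards amounts to rearranging the method: the cutoff in seconds times runners per second gives the number of runners cut, adding the accepted applications gives total applications, and dividing by the number of mBQs gives the demand percentage. A minimal Python sketch, with a function name made up for this illustration:

```python
def required_demand(cutoff_seconds: float, accepted: int,
                    rate: float, mbqs: int) -> float:
    """Demand percentage at which the cutoff equals cutoff_seconds."""
    runners_cut = cutoff_seconds * rate       # runners squeezed out
    applications = accepted + runners_cut     # total applications implied
    return applications / mbqs

# Conservative assumptions from the text: 23,000 accepted, 22.9 runners per second
demand = required_demand(cutoff_seconds=300, accepted=23_000, rate=22.9, mbqs=56_916)
print(f"{demand:.1%}")  # 52.5%
```

With these inputs the sketch gives about 52.5%, in line with the roughly 52% quoted above.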

Interestingly enough, using the midpoint of each assumption, (i) a 51% demand rate, (ii) 23,500 applications accepted, and (iii) 27.15 runners per second, results in a cutoff of 3:23, which was the cutoff for 2018.
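The mid-range scenario can be reproduced with a short Python sketch of the three-step method. The function names are invented for this illustration, and two choices are assumptions of the sketch rather than part of the method as described: fractional seconds are truncated, and the cutoff is treated as zero when applications do not exceed accepted entries.

```python
def cutoff_seconds(mbqs: int, demand: float, accepted: int, rate: float) -> float:
    """Estimated cutoff in seconds from the four parameters."""
    applications = mbqs * demand                     # step 1: total applications
    runners_cut = max(applications - accepted, 0)    # step 2: runners cut (assumed never negative)
    return runners_cut / rate                        # step 3: divide by runners per second

def as_clock(seconds: float) -> str:
    """Format seconds as m:ss, truncating fractional seconds."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"

# Mid-range assumptions: 51% demand, 23,500 accepted, 27.15 runners per second
mid = cutoff_seconds(mbqs=56_916, demand=0.51, accepted=23_500, rate=27.15)
print(as_clock(mid))  # 3:23
```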

My gut, however, tells me it will be higher than that. My actual prediction is 5:45. This prediction is not totally random. I have a mBQ of -5:46 in the bank and I don’t want to predict that I won’t get in!