Men’s Team Predictions for 2021 NCAA D1 XC Championships

Saturday marks the running of the 2021 NCAA D1 Cross Country National Championships in Tallahassee with Northern Arizona on the brink of their fifth team title in six years.

It has been a while since I have publicly published any new models in any sport, but yesterday on Twitter, Citius Magazine posted something about a video they had done with Isaac Wood at The Wood Report on his prediction.

Being relatively new to the world of collegiate track and cross country, I had no idea who Isaac was and immediately went and subscribed to his website to see what he had built. I also like to see how others model sport and Isaac has an interesting website.

After seeing their tweet and checking out the website and his simulator, I began to wonder what I could produce before Saturday’s meet. My initial thought was to create my own individual runner ratings and simulate from there, but to be honest, that is something I have been thinking about for a few months now and is just too big a project. What I could do is take Isaac Wood’s individual runner rankings and try to expand on the team result.

PROJECT OVERVIEW

I decided I would build a quick Monte Carlo simulator using the top-7 runners for each team racing Saturday, based on The Wood Report, and calculate probabilities for how each team will do. I also figured I might as well look at individual runners and how the top-10 may look. (HUGE CAVEAT – I am not including any of the individual runners who are not competing within the team standings. I just didn’t have enough time to build everything from scratch.)

EDITOR’S NOTE – While writing this, I realized that most of you couldn’t care less about the methodology section, so please feel free to skip all of that and see what I came up with. I understand. I do these types of projects mostly to share my thinking so I can improve my methods at a later date as questions/data improvements come.

METHODOLOGY

The first step is to take the 31 teams and find the top-7 runner ratings. With such a short time horizon and not really having a chance to build my own ratings, I have to combine a little art with the science. The main weakness of doing something like this is that I have no idea how Isaac Wood (and his PhD student) created these ratings and really no idea about the variability of each individual runner.

Cross country is such a great event because every course is different every day. Hills, terrain, altitude, temperature and humidity vary even from day to day on the same course. I am not even going to get into teams racing at a variety of distances leading up to this weekend or where individual runners were within their training when they raced various events. That is not being captured here.

We have what we have. Each runner has been given a rating that from my point of view appears to be really solid.

What I can do is simulate variability. This is where art blends with science.

Let’s take BYU’s Conner Mantz with a rating of 9.97. Sure, he’s a favorite, but by how much? Ideally, we would have all of the variables I mentioned above already baked into his rating, plug Saturday’s expected variables into a formula, and see what his expected time would be. We would then do that for every runner and build the projected results.

But I don’t have that. I have a bunch of individual ratings. Wood simply uses those to build a final result. Instead of doing that, I prefer to throw some variability into the ratings and run the race thousands of times.

How much variability and where do you model this?

Back to Mantz. Sure, he’s a favorite, but there are several guys who can win this race. Some people will have the race of their lives while others will struggle for one reason or another (think of the three H’s: hills, heat and humidity).

I took the top runner for each school participating and found the standard deviation. I did this as well for the rest of the runners. Not surprisingly, as you go from the first runner for each team to the seventh, the variability explodes. This makes sense. Depth is where this thing is won.

I finally settled on a number closer to the standard deviation of the top runners for each team and used it for every runner in the field. This isn’t the best method, as each runner should have their own variance, but it’ll do.
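For anyone curious what that step looks like in code, here is a minimal Python sketch of measuring the spread at each roster slot. The team names and ratings are made up for illustration; they are not The Wood Report’s numbers, and using the sample standard deviation is my assumption.

```python
from statistics import stdev

# Hypothetical ratings (NOT real data): team -> seven runner ratings.
ratings = {
    "Team A": [9.90, 9.70, 9.60, 9.40, 9.20, 9.00, 8.70],
    "Team B": [9.80, 9.75, 9.50, 9.30, 9.00, 8.60, 8.20],
    "Team C": [9.70, 9.60, 9.55, 9.10, 8.80, 8.30, 7.90],
}

def spread_by_slot(ratings):
    """Sample standard deviation of the ratings at each roster slot (1st..7th)."""
    # Sort each team best-first, then group all #1 runners, all #2 runners, etc.
    slots = zip(*(sorted(team, reverse=True) for team in ratings.values()))
    return [stdev(slot) for slot in slots]

slot_sd = spread_by_slot(ratings)
for position, sd in enumerate(slot_sd, start=1):
    print(f"runner {position}: sd = {sd:.3f}")
```

In this toy data the spread grows from the first slot to the seventh, which is the same pattern described above.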

FINALLY, I use a function to generate a random number within +/- 1 ‘standard deviation’ for each runner to figure out their ‘speed’ for the race, then rank the runners and score the race. To understand this better, imagine some runners will run better and some worse, but they won’t always run right at their rating. Odds are they will be somewhere close. Of course there will be outliers, but let’s assume most runners stay within about one standard deviation of their rating (in a normal distribution, roughly 34% of outcomes fall on each side of the mean inside that band, about 68% in total). I also don’t want to talk about Outliers too much, as this may send shivers down Chris Chavez’s spine. #TeamGladwell. Just kidding.

I do this 10,000 times.

That is, I simulate the race 10,000 times and see how it all plays out. This should give us a pretty good indication of the probability each team has to finish this weekend in a particular place.
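The loop described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not my actual simulator: the ratings are hypothetical, the noise is drawn from a normal distribution with `random.gauss` (swapping in `rng.uniform(-sd, sd)` would be the other reading of "within +/- 1 standard deviation"), and ties are broken by the sixth runner, which is a simplification rather than the official NCAA rule. The scoring itself is standard cross country scoring: add up the places of each team’s first five finishers, and the low score wins.

```python
import random
from collections import Counter

def score_race(speeds):
    """Return team names ordered from first to last for one simulated race.

    speeds: list of (team, speed) pairs, one per runner; higher speed finishes first.
    """
    order = sorted(speeds, key=lambda pair: pair[1], reverse=True)
    places = {}
    for place, (team, _) in enumerate(order, start=1):
        places.setdefault(team, []).append(place)
    # Team score = sum of the first five places; tie-break on the sixth runner
    # (an assumption made to keep the sketch short).
    return sorted(places, key=lambda t: (sum(places[t][:5]), places[t][5:6]))

def simulate(ratings, sd, n_sims=10_000, seed=0):
    """Count how often each team finishes in each place over n_sims races."""
    rng = random.Random(seed)
    finishes = {team: Counter() for team in ratings}
    for _ in range(n_sims):
        # Same spread for every runner, as in the post; each gets fresh noise.
        speeds = [(team, rating + rng.gauss(0.0, sd))
                  for team, team_ratings in ratings.items()
                  for rating in team_ratings]
        for rank, team in enumerate(score_race(speeds), start=1):
            finishes[team][rank] += 1
    return finishes

# Hypothetical ratings (NOT The Wood Report's numbers).
ratings = {
    "Team A": [9.9, 9.7, 9.6, 9.5, 9.4, 9.2, 9.0],
    "Team B": [9.8, 9.7, 9.5, 9.4, 9.3, 9.2, 9.1],
    "Team C": [9.6, 9.5, 9.4, 9.3, 9.1, 9.0, 8.8],
}
finishes = simulate(ratings, sd=0.15, n_sims=10_000, seed=2021)
for team, counts in finishes.items():
    print(f"{team}: wins {counts[1] / 10_000:.2%} of simulated races")
```

Fixing the seed makes a run repeatable, which helps when comparing different choices for the spread.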

TEAM RESULTS

After running the simulation 10,000 times, the overwhelming favorite is Northern Arizona, which wins the title 48.78% of the time. Oklahoma State captures the title 22.64% of the time, with Iowa State winning 12% of the time.

The table below shows how many times each team finished in each of the top five places.

SCHOOL	1ST	2ND	3RD	4TH	5TH
Northern Arizona	4878	2479	1350	722	380
Oklahoma State	2264	2368	1978	1410	978
Iowa State	1200	1728	1782	1688	1501
Colorado	705	1257	1643	1782	1651
Notre Dame	573	1128	1538	1725	1772
BYU	280	637	946	1329	1702
Stanford	92	370	670	1093	1519
Tulsa	8	33	93	251	495

How good is Northern Arizona? They have just over an 87% chance to finish in the top-3.
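That top-3 figure is just the first three columns of the table added together and divided by the 10,000 runs. Here is a small Python sketch of the arithmetic using the counts above. The standard-error helper is my addition: it measures simulation noise only (the usual binomial formula), not whether the ratings or the spread are right.

```python
from math import sqrt

N_SIMS = 10_000

# Counts for places 1st..5th, copied from the table above.
place_counts = {
    "Northern Arizona": [4878, 2479, 1350, 722, 380],
    "Oklahoma State": [2264, 2368, 1978, 1410, 978],
    "Iowa State": [1200, 1728, 1782, 1688, 1501],
}

def top_k_probability(team, k):
    """Share of simulated races in which the team finished in the top k."""
    return sum(place_counts[team][:k]) / N_SIMS

def monte_carlo_se(p, n=N_SIMS):
    """Standard error of a simulated probability from n independent runs."""
    return sqrt(p * (1 - p) / n)

nau_top3 = top_k_probability("Northern Arizona", 3)
nau_win = top_k_probability("Northern Arizona", 1)
print(f"NAU top-3: {nau_top3:.2%}")  # 87.07%
print(f"NAU win:   {nau_win:.2%} +/- {monte_carlo_se(nau_win):.2%}")  # 48.78% +/- 0.50%
```

With 10,000 runs the simulation noise on the win probability is about half a percentage point, so the gaps between the top three teams are not an artifact of the random draws.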

Possibly the more interesting part of all of this is the fact that after the top two, the teams are very bunched together. The probabilities for Iowa State, Colorado, Notre Dame, BYU and Stanford are very close. Fighting through to the end will be key, and I wonder if something that was mentioned on the podcast could be a factor. Will the course favor track runners over those ‘mudders’ like Colorado?

This is even clearer when we look at the average team finish below.

And don’t ignore Tulsa! They actually won the whole thing 8 times out of 10,000. Sure, that’s only 0.08% of the time, but there’s a chance.

Here’s a table of each team’s AVERAGE FINISH within the simulation.

SCHOOL	AVERAGE TEAM FINISH
Northern Arizona	1.99
Oklahoma State	2.99
Iowa State	3.80
Colorado	4.33
Notre Dame	4.49
BYU	5.30
Stanford	5.82
Tulsa	7.40
Oregon	9.93
Air Force	11.86
Arkansas	12.11
Furman	12.87
Washington	13.03
Wake Forest	13.51
Wisconsin	14.79
Gonzaga	15.27
Ole Miss	15.36
Alabama	18.19
Texas	20.51
Harvard	21.20
Southern Utah	21.23
Portland	23.53
Syracuse	24.00
North Carolina	24.01
Butler	24.28
Florida State	24.76
Princeton	25.32
Georgetown	26.06
Minnesota	28.16
Michigan	29.05
Michigan State	30.85
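The average finish in the table above is a weighted mean: each place multiplied by the number of times the team landed there, divided by the number of runs. A minimal Python sketch, using a made-up tally (the full place-by-place counts for all 31 teams are not shown in this post):

```python
from collections import Counter

def average_finish(place_counts):
    """Weighted mean finishing place from a {place: times_finished_there} tally."""
    total_runs = sum(place_counts.values())
    return sum(place * times for place, times in place_counts.items()) / total_runs

# Hypothetical tally for one team over 10,000 simulated races.
demo_counts = Counter({1: 5000, 2: 3000, 3: 2000})
print(f"average finish: {average_finish(demo_counts):.2f}")  # 1.70
```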

I find it interesting to see the Big Ten Conference anchoring the bottom. If this pans out, then maybe the committee overvalued their quality. If they perform much better than the model, then I would suspect the Wood model rated those teams too low.

INDIVIDUALS

Now, remember, I did not include the true individuals, running the event without their teams.

Here are the top-10 runners based on the simulation.

INDIVIDUAL	SCHOOL	1ST	2ND	3RD	4TH	5TH	6TH	7TH	8TH	9TH	10TH
Conner Mantz	BYU	4074	1696	975	715	576	498	433	335	263	192
Adriaan Wildschutt	Florida State	2272	1839	1096	834	646	551	460	442	431	339
Wesley Kiptoo	Iowa State	1840	1727	1193	826	688	593	501	480	435	401
Eduardo Herrera	Colorado	436	885	994	781	672	643	572	525	497	477
Abdihamid Nur	Northern Arizona	415	932	988	805	715	599	603	542	491	450
Nico Young	Northern Arizona	397	853	896	782	685	570	567	531	527	490
Charles Hicks	Stanford	187	505	764	716	646	579	589	538	543	513
Casey Clinger	BYU	165	496	656	724	638	660	520	516	502	504
Cooper Teare	Oregon	82	312	513	620	621	559	550	522	496	456
Ahmed Muhumed	Florida State	47	210	397	532	568	505	505	512	503	511

The race is expected to be very close and Mantz (BYU), Wildschutt (FSU) and Kiptoo (ISU) are the clear favorites, but all of the big names are there. Once you venture past those first three, it appears to be wide open.

My model actually had 18 different runners who won the title at least once (way to go, Ky Robinson!) and 21 total who grabbed second at least one time.

I hope you enjoyed this. I understand it was a long and winding road, but I enjoyed diving into this for a day or so and seeing what the numbers showed. Best of luck to all of the runners, especially the ones I know… (go BTR!)