To mentally prepare for an upcoming trail running race, I scraped data from the largest trail running organization, the UTMB, and published the data and code on Kaggle. In this post I describe the source, present the “Race Finder” application, and share some insights.
Trail running races are much like regular road running races except that they are off-road by definition, and generally longer than your typical road race. The premier global trail running race is the Ultra-Trail du Mont-Blanc, held in Chamonix-Mont-Blanc, France. It is a 174 km (108 mi) race with 10,000 m (33,000 ft) of elevation gain that is won in about 20 hours. This race is organized by the UTMB Group. In addition to the UTMB, they organize 40+ prominent races across Asia, Oceania, Europe, Africa, and the Americas called the “UTMB World Series”. Their website UTMB.world contains the results of these events. But to my surprise, it also contains the results of thousands of other trail running races. Since their website displays results using simple pagination, and not state-based systems, it was easy to scrape data from the site using Beautiful Soup.
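To illustrate why simple pagination makes scraping easy, here is a minimal sketch of the loop: request page 1, 2, 3, and so on, parse the result rows from each page, and stop at the first empty page. This is not the code from the Kaggle notebook. The table layout, the `fetch` callback, and the fake pages are assumptions made up for the example, and the sketch uses Python's built-in `html.parser` instead of Beautiful Soup so that it runs without extra dependencies.

```python
from html.parser import HTMLParser
from typing import Callable, Iterator, List


class RowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[str]] = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._in_cell = False
            self.rows.append([])
        elif tag == "td" and self.rows:
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


def parse_rows(html: str) -> List[List[str]]:
    """Return the data rows of a page; header-only rows come out empty and are dropped."""
    parser = RowParser()
    parser.feed(html)
    return [row for row in parser.rows if row]


def scrape_all_pages(fetch: Callable[[int], str], max_pages: int = 1000) -> Iterator[List[str]]:
    """Walk page 1, 2, 3, ... and stop at the first page without result rows."""
    for page in range(1, max_pages + 1):
        rows = parse_rows(fetch(page))
        if not rows:
            break
        yield from rows


# Hypothetical stand-in for an HTTP request: two pages of results, then nothing.
FAKE_PAGES = {
    1: "<table><tr><th>Name</th><th>Time</th></tr>"
       "<tr><td>Runner A</td><td>10:15:00</td></tr></table>",
    2: "<table><tr><td>Runner B</td><td>11:40:30</td></tr></table>",
}


def fake_fetch(page: int) -> str:
    return FAKE_PAGES.get(page, "<table></table>")


results = list(scrape_all_pages(fake_fetch))
print(results)
```

In a real scraper, `fake_fetch` would be replaced by a function that downloads the page for a given page number, and the parsing step would depend on the actual markup of the site.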
Have a look at the scraped data in the UTMB Race Finder below or in full screen. This tool can help you manage expectations by reviewing historic data, or help you find a race to participate in.
Trail running races vary a lot more than road running races. This makes estimating the challenge of a race more difficult. Factors you will have to consider in addition to the distance include the elevation gain, terrain type, and climate. Setting the right expectations can substantially improve race preparation and race outcome.
For example, this data would have helped me in my first UTMB race, the X-Traversée. The X-Traversée starts at 8 am, and I was hoping to finish the 76 km race before sunset, in about 12 hours. That is a reasonable 6 km/h, right? The data, however, would have told me that this would have put me in the top 4% of runners, which is pretty ambitious for a first-timer from a flat country. It would also have told me that finishers take 17 hours on average. The unexpected climb and the finish in the dark were mental blows that this data could have prevented.
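The check I describe above boils down to two small calculations: the pace implied by a target time, and the share of finishers who were at least that fast. The sketch below shows both. The finish times are invented for illustration and are not the X-Traversée results; the function names are hypothetical as well.

```python
def pace_kmh(distance_km: float, hours: float) -> float:
    """Average pace needed to cover the distance in the given time."""
    return distance_km / hours


def share_at_least_as_fast(finish_hours, target_hours: float) -> float:
    """Fraction of finishers whose time is at or below the target time."""
    faster = sum(1 for t in finish_hours if t <= target_hours)
    return faster / len(finish_hours)


# Hypothetical finish times in hours for a 76 km race (made-up sample data).
sample_times = [11.5, 13.0, 14.2, 15.8, 16.5, 17.0, 17.9, 18.4, 19.6, 21.1]

target = 12.0
pace = pace_kmh(76, target)
share = share_at_least_as_fast(sample_times, target)
mean_time = sum(sample_times) / len(sample_times)

print(f"Required pace: {pace:.1f} km/h")
print(f"Share of finishers at or under {target:.0f} h: {share:.0%}")
print(f"Mean finish time: {mean_time:.1f} h")
```

With real finish times for the race, a small share combined with a mean far above the target is a sign that the goal is more ambitious than the pace alone suggests.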
Finding the right UTMB race means navigating a vast array of options, as the UTMB Race Finder includes data from 19,894 unique races worldwide. With such variety, it is essential to consider more than just distance: elevation gain, DNF rate, continent, country, and race category (50K, 100K, or 100M) all matter when making the right choice. The UTMB World Series offers over 40 flagship races, but the Race Finder lets you explore thousands more and filter them by these criteria to match your fitness level and experience.
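The same kind of filtering can be done directly on a CSV of race metadata. Below is a minimal sketch using only the standard library. The column names (`race`, `category`, `elevation_gain_m`, `dnf_rate`, and so on) and the four sample rows are assumptions for illustration; they are not the actual columns or contents of the published file.

```python
import csv
import io

# Made-up sample in place of the real file; column names are hypothetical.
SAMPLE_CSV = """race,category,country,distance_km,elevation_gain_m,dnf_rate
Race A,50K,FR,52,3100,0.12
Race B,100K,IT,101,6200,0.35
Race C,50K,NL,48,900,0.05
Race D,100M,US,165,9800,0.41
"""


def find_races(rows, category, max_elevation_gain_m, max_dnf_rate):
    """Return the names of races in a category below the given difficulty limits."""
    return [
        row["race"]
        for row in rows
        if row["category"] == category
        and float(row["elevation_gain_m"]) <= max_elevation_gain_m
        and float(row["dnf_rate"]) <= max_dnf_rate
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# A cautious first 50K: little climbing and few runners dropping out.
matches = find_races(rows, category="50K", max_elevation_gain_m=2000, max_dnf_rate=0.10)
print(matches)
```

To run this against a downloaded file, `io.StringIO(SAMPLE_CSV)` would be swapped for an opened file handle and the column names adjusted to whatever the file actually uses.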
Before diving into the detailed data and tools I’ve developed, let’s take a quick look at some of the most interesting facts I uncovered during my analysis. From the steepest inclines to the races with the longest distances, these highlights offer a glimpse into the wide range of challenges the UTMB races present. Whether you’re a data enthusiast or a trail runner planning your next adventure, these quick facts will provide a snapshot of the diversity and scale of trail races worldwide.
Category | Race name | Value |
---|---|---|
Steepest average incline | Vertikal K3 Bei 2019 | 30.6% |
Longest distance | Great Himal Race 2017 | 1,355 km |
Shortest distance | Amangeldy Race 2023 | 6 km |
Most elevation gain | Great Himal Race 2017 | 80,230 m |
Longest mean finish time | Great Himal Race 2017 | 415.3 hours |
Shortest mean finish time | GIIR DI MONT 2022 | 0.9 hours |
Highest DNF rate | Mad Fox Ultra 2019 | 89.3% |
Largest portion of female participants | QUEEN of the JUNGLE 2017 | 85.7% |
Most participants | La SaintéLyon 2017 | 6,740 |
Longest race time recorded | Great Himal Race 2017 | 500.0 hours |
Most international race | UTMB® Mont Blanc 2022 | 83 countries |
Exclusion criteria: < 10 participants, < 10 finishers, uncategorized races, female-exclusive races, < 1 km race distance, mean finish time of 0 hours, last finish time > 10 × mean finish time
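As a rough sketch, the exclusion criteria can be written as a single predicate. The `Race` record and its field names are hypothetical and the three sample races are invented. The sketch also assumes that the last criterion is meant to drop races whose last finish time is more than ten times the mean, which looks like a guard against timing errors.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Race:
    # Hypothetical record layout; the real data may use different fields.
    name: str
    participants: int
    finishers: int
    category: Optional[str]  # None means the race is uncategorized
    female_only: bool
    distance_km: float
    mean_finish_h: float
    last_finish_h: float


def is_excluded(race: Race) -> bool:
    """True if the race matches at least one exclusion criterion."""
    return (
        race.participants < 10
        or race.finishers < 10
        or race.category is None
        or race.female_only
        or race.distance_km < 1
        or race.mean_finish_h == 0
        # Assumed reading: a last time far beyond the mean is treated as bad data.
        or race.last_finish_h > 10 * race.mean_finish_h
    )


# Made-up races for illustration.
races = [
    Race("Keep me", 120, 95, "50K", False, 52.0, 8.5, 14.0),
    Race("Too small", 8, 8, "50K", False, 30.0, 4.0, 6.0),
    Race("Timing glitch", 200, 150, "100K", False, 100.0, 15.0, 400.0),
]

kept = [race.name for race in races if not is_excluded(race)]
print(kept)
```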
The source file behind this cleaned file is 187 MB. It contains the individual finish times required to generate the histograms shown in the UTMB Race Finder, and the frequency of countries of origin. The source data file is not publicly available due to its size. Please get in touch if you are interested in using this file.
To reduce the file size, I aggregated the individual finish times to “Mean Finish Time”, “Winning Time”, and “Last Time”, and the countries of origin to “N Countries”. This reduces the file to 3.7 MB, which you can download here: utmb_sheet.csv. In addition, I have published the data on Kaggle. Here is some important information about this file containing race metadata:
I have attached the code that I used to scrape this data as a notebook to the Kaggle data set: Scripts used in UTMB data collection
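The aggregation step described above, from individual results to one row per race, could look roughly like the sketch below. It is not taken from the Kaggle notebook. The input layout of (race, finish time in hours, country) tuples and the sample results are assumptions; only the four output names come from the published file.

```python
from collections import defaultdict
from statistics import mean

# Made-up individual results: (race, finish time in hours, country of origin).
RESULTS = [
    ("Race A", 10.0, "FR"),
    ("Race A", 12.5, "NL"),
    ("Race A", 15.0, "FR"),
    ("Race B", 5.0, "IT"),
    ("Race B", 7.0, "IT"),
]


def aggregate(results):
    """Collapse individual results into one summary record per race."""
    times = defaultdict(list)
    countries = defaultdict(set)
    for race, hours, country in results:
        times[race].append(hours)
        countries[race].add(country)
    return {
        race: {
            "Mean Finish Time": mean(race_times),
            "Winning Time": min(race_times),
            "Last Time": max(race_times),
            "N Countries": len(countries[race]),
        }
        for race, race_times in times.items()
    }


summary = aggregate(RESULTS)
for race, stats in summary.items():
    print(race, stats)
```

The trade-off is the one mentioned above: the summary is far smaller, but the histograms and the per-country frequencies can no longer be rebuilt from it.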
Trail running presents unique challenges, and having access to historical race data can significantly enhance both race preparation and race selection. By scraping and analyzing data from UTMB.world, I’ve created tools like the Race Finder to help runners navigate the vast array of races, understand their difficulty, and set realistic goals. Whether you’re preparing for a specific event or just exploring the trail running world, the insights shared here can help you manage expectations and make informed decisions.
The data is available for further exploration on Kaggle, where you’ll also find the code used to gather this information. If you’re interested in delving deeper into the dataset or have questions, feel free to reach out. I hope this resource helps fellow trail runners in their race journey. Good luck, and happy trails!