Share of FrontierMath problems solved correctly by AI models

What you should know about this indicator

  • This indicator shows the share of FrontierMath problems that AI models solve correctly, based on Epoch AI's evaluation.
  • FrontierMath is a set of 350 original math problems written by experts, covering many areas of advanced mathematics. Many problems are difficult enough that human specialists might need hours or days to solve them.
  • The benchmark has four difficulty tiers. This indicator shows accuracy on Tiers 1–3 (300 problems). Tier 4 contains 50 exceptionally difficult problems and is not included here.
  • Scoring is all-or-nothing: models get 1 point for a correct final answer and 0 for anything else, with no partial credit. Models submit their answers as Python code and can use Python while working on problems. This means scores reflect math ability with access to computational tools, not just pen-and-paper reasoning.
  • Only 12 problems are publicly available, mainly so researchers can inspect how evaluations work, not to report scores.
  • FrontierMath was developed by Epoch AI with funding from OpenAI, whose GPT models are among those evaluated on this benchmark. OpenAI has exclusive access to a subset of the problems.
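Because scoring is all-or-nothing, the headline figure is simply the fraction of the 300 Tier 1–3 problems answered exactly right, expressed as a percentage. A minimal sketch of that computation (the function names are illustrative, not Epoch AI's actual evaluation harness):

```python
def score_problem(model_answer, reference_answer):
    """All-or-nothing scoring: 1 for an exactly correct final answer, else 0."""
    return 1 if model_answer == reference_answer else 0

def share_solved(model_answers, reference_answers):
    """Share (in %) of problems solved correctly, with no partial credit."""
    assert len(model_answers) == len(reference_answers)
    points = sum(score_problem(a, r)
                 for a, r in zip(model_answers, reference_answers))
    return 100 * points / len(reference_answers)

# Toy example: 3 of 4 answers match the reference exactly.
print(share_solved([42, "x**2 + 1", 7, 9], [42, "x**2 + 1", 7, 8]))  # → 75.0
```

Note that a near-miss (the final `9` vs `8` above) scores the same as a blank answer.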

How is this data described by its producer?

FrontierMath is a benchmark of hundreds of original, exceptionally challenging mathematics problems crafted and vetted by expert mathematicians. The questions cover most major branches of modern mathematics – from computationally intensive problems in number theory and real analysis to abstract questions in algebraic geometry and category theory. Solving a typical problem requires multiple hours of effort from a researcher in the relevant branch of mathematics, and for the upper end questions, multiple days.

The full FrontierMath dataset contains 350 problems. This is split into a base set of 300 problems, which we call Tiers 1-3, and an expansion set of 50 exceptionally difficult problems, which we call Tier 4. We have made 10 problems from Tiers 1-3 public, calling this frontiermath-2025-02-28-public. The remaining 290 problems make up frontiermath-2025-02-28-private. Similarly, we have made 2 problems from Tier 4 public, calling this frontiermath-tier-4-2025-07-01-public, while the remaining 48 problems make up frontiermath-tier-4-2025-07-01-private. Unless explicitly mentioned otherwise, all the numbers on this hub correspond to evaluations on the private sets. You can find more information about the public problems here.

FrontierMath was developed with funding from OpenAI, who has exclusive access to a subset of the benchmark.

Share of FrontierMath problems solved correctly by AI models
The FrontierMath benchmark evaluates models on 300 difficult, research-level problems in advanced mathematics (Tiers 1–3), which can take expert mathematicians hours or days to work through.
Source
Epoch AI (2026) – with minor processing by Our World in Data
Last updated
January 30, 2026
Next expected update
May 2026
Unit
%

Sources and processing

Epoch AI – Epoch AI Benchmark Data

Comprehensive collection of AI benchmark datasets from Epoch AI, including FrontierMath and other performance benchmarks.

Retrieved on
March 8, 2026
Citation
This is the citation of the original data obtained from the source, prior to any processing or adaptation by Our World in Data. To cite data downloaded from this page, please use the suggested citation given in Reuse This Work below.
Epoch AI, ‘AI Benchmarking Hub’. Published online at epoch.ai. Retrieved from ‘https://epoch.ai/benchmarks’ [online resource]. Accessed 30 Jan 2026.


All data and visualizations on Our World in Data rely on data sourced from one or several original data providers. Preparing this original data involves several processing steps. Depending on the data, this can include standardizing country names and world region definitions, converting units, calculating derived indicators such as per capita measures, as well as adding or adapting metadata such as the name or the description given to an indicator.

At the link below you can find a detailed description of the structure of our data pipeline, including links to all the code used to prepare data across Our World in Data.

Read about our data pipeline

How to cite this page

To cite this page overall, including any descriptions, FAQs or explanations of the data authored by Our World in Data, please use the following citation:

“Data Page: Share of FrontierMath problems solved correctly by AI models”, part of the following publication: Charlie Giattino, Edouard Mathieu, Veronika Samborska, and Max Roser (2023) - “Artificial Intelligence”. Data adapted from Epoch AI. Retrieved from https://archive.ourworldindata.org/20260310-113828/grapher/ai-frontiermath-over-time.html [online resource] (archived on March 10, 2026).

How to cite this data

In-line citation
If you have limited space (e.g. in data visualizations), you can use this abbreviated in-line citation:

Epoch AI (2026) – with minor processing by Our World in Data

Full citation

Epoch AI (2026) – with minor processing by Our World in Data. “Share of FrontierMath problems solved correctly by AI models” [dataset]. Epoch AI, “Epoch AI Benchmark Data” [original data]. Retrieved April 1, 2026 from https://archive.ourworldindata.org/20260310-113828/grapher/ai-frontiermath-over-time.html (archived on March 10, 2026).

Quick download

Download the data shown in this chart as a ZIP file containing a CSV file, metadata in JSON format, and a README. The CSV file can be opened in Excel, Google Sheets, and other data analysis tools.

Data API

Use these URLs to programmatically access this chart's data and configure your requests with the options below. Our documentation provides more information on how to use the API, and you can find a few code examples below.

Data URL (CSV format)
https://ourworldindata.org/grapher/ai-frontiermath-over-time.csv?v=1&csvType=full&useColumnShortNames=false
Metadata URL (JSON format)
https://ourworldindata.org/grapher/ai-frontiermath-over-time.metadata.json?v=1&csvType=full&useColumnShortNames=false
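Both URLs share the same base and query parameters, so request variants can be built programmatically instead of hand-editing strings. A sketch using only the parameters shown above (whether the API accepts values other than those shown is not documented on this page):

```python
from urllib.parse import urlencode

BASE = "https://ourworldindata.org/grapher/ai-frontiermath-over-time"

def build_url(fmt="csv", **options):
    """Build a data or metadata URL.

    fmt is "csv" for the data or "metadata.json" for the metadata;
    keyword options override the default query parameters shown above.
    """
    params = {"v": 1, "csvType": "full", "useColumnShortNames": "false"}
    params.update(options)
    return f"{BASE}.{fmt}?{urlencode(params)}"

print(build_url())                  # the CSV data URL shown above
print(build_url("metadata.json"))   # the JSON metadata URL shown above
```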

Code examples

Examples of how to load this data into different data analysis tools.

Excel / Google Sheets
=IMPORTDATA("https://ourworldindata.org/grapher/ai-frontiermath-over-time.csv?v=1&csvType=full&useColumnShortNames=false")
Python with Pandas
import pandas as pd
import requests

# Fetch the data
df = pd.read_csv("https://ourworldindata.org/grapher/ai-frontiermath-over-time.csv?v=1&csvType=full&useColumnShortNames=false", storage_options={"User-Agent": "Our World In Data data fetch/1.0"})

# Fetch the metadata
metadata = requests.get("https://ourworldindata.org/grapher/ai-frontiermath-over-time.metadata.json?v=1&csvType=full&useColumnShortNames=false").json()
R
library(jsonlite)

# Fetch the data
df <- read.csv("https://ourworldindata.org/grapher/ai-frontiermath-over-time.csv?v=1&csvType=full&useColumnShortNames=false")

# Fetch the metadata
metadata <- fromJSON("https://ourworldindata.org/grapher/ai-frontiermath-over-time.metadata.json?v=1&csvType=full&useColumnShortNames=false")
Stata
import delimited "https://ourworldindata.org/grapher/ai-frontiermath-over-time.csv?v=1&csvType=full&useColumnShortNames=false", encoding("utf-8") clear