Setting the scene The 2025 International Mathematics Olympiad has come and gone. Reminder: this is an exam for high-school kids across the world (each country typically sends six kids), comprising of two 4.5-hour exams each containing three questions, so six … Continue reading →| Xena
A month or two ago I wrote this post which expressed my frustration with various issues around private datasets as a way of measuring the mathematical abilities of language models. More generally I was frustrated about the difficulty of being … Continue reading →| Xena
Undergraduate mathematicians usually have a hard time defining functions from quotients in Lean, because they have been taught a specific model for quotients in their classes, which is not the model that Lean uses. This post is an attempt to … Continue reading →| Xena
My feed was recently clogged up with news articles reporting that Sam Altman thinks that AGI is here, or will be here next year, or whatever. I will refrain from giving even more air to this nonsense by linking to … Continue reading →| Xena
So I'm two months into trying to teach a proof of Fermat's Last Theorem to a computer. We already have one interesting story, which I felt was worth sharing. Continue reading →| Xena
A huge amount happened in the Lean theorem prover community in 2023; this blog post looks back at some of these events, plus some of what we have to look forward to in 2024. Modern mathematics I personally am a … Continue reading →| Xena
(This is a guest post by Bhavik Mehta) On March 16, 2023, a paper by Campos, Griffiths, Morris, and Sahasrabudhe appeared on the arXiv, announcing an exponential improvement to the upper bound on Ramsey numbers, an open problem since 1935. … Continue reading →| Xena
A brief survey post containing some of the things which happened in the Lean community in 2022. The Liquid Tensor Experiment In December 2020, Fields Medallist Peter Scholze challenged the formal computer proof community to verify one of his theorems … Continue reading →| Xena
The liquid tensor experiment is now fully completed.| Xena
So the big news this week is that o3, OpenAI’s new language model, got 25% on FrontierMath. Let’s start by explaining what this means.| Xena