Thanks to Collin Burns, Ruiqi Zhong, Cassidy Laidlaw, Jean-Stanislas Denain, and Erik Jones, who generated most of the considerations discussed in this post. Previously [https://bounded-regret.ghost.io/ai-forecasting-one-year-in/], I evaluated the accuracy of forecasts about performance on the MATH and MMLU (Massive Multitask) datasets. I argued that most people,| Bounded Regret
Thanks to Hao Zhang, Kayvon Fatahalian, and Jean-Stanislas Denain for helpful discussions and comments. Addendum and erratum. See here [https://kipp.ly/blog/transformer-inference-arithmetic/] for an excellent discussion of similar ideas by Kipply Chen. In addition, James Bradbury has pointed out to me that some of the constants in this
In 1972, the Nobel prize-winning physicist Philip Anderson wrote the essay "More Is Different [https://science.sciencemag.org/content/177/4047/393]". In it, he argues that quantitative changes can lead to qualitatively different and unexpected phenomena. While he focused on physics, one can find many examples of More is
Last August, my research group created a forecasting contest [https://bounded-regret.ghost.io/ai-forecasting/] to predict AI progress on four benchmarks. Forecasters were asked to predict state-of-the-art performance (SOTA) on each benchmark for June 30th 2022, 2023, 2024, and 2025. It’s now past June 30th, so we can evaluate