To communicate risks, we often turn to stories. Nuclear weapons conjure stories of mutually assured destruction, briefcases with red buttons, and nuclear winter. Climate change conjures stories of extreme weather, cities overtaken by rising sea levels, and crop failures. Pandemics require little imagination after COVID, but were previously the subject … | Bounded Regret
Earlier this year, my research group commissioned 6 questions [https://prod.hypermind.com/ngdp/en/showcase2/showcase.html?sc=JSAI] for professional forecasters to predict about AI. Broadly speaking, 2 were on geopolitical aspects of AI and 4 were on future capabilities: * Geopolitical: * How much larger or smaller will the …
I previously discussed the capabilities we might expect from future AI systems, illustrated through GPT2030, a hypothetical successor of GPT-4 trained in 2030. GPT2030 had a number of advanced capabilities, including superhuman programming, hacking, and persuasion skills, the ability to think more quickly than humans and to learn quickly by …
Given their advanced capabilities, future AI systems could pose significant risks to society. Some of this risk stems from humans using AI systems for bad ends (misuse), while some stems from the difficulty of controlling AI systems “even if we wanted to” (misalignment). We can analogize both of these with …
I'm experimenting with hosting guest posts on this blog, as a way to represent additional viewpoints and especially to highlight ideas from researchers who do not already have a platform. Hosting a post does not mean that I agree with all of its arguments, but it does mean that I …
Two years ago, I commissioned forecasts for state-of-the-art performance on several popular ML benchmarks. Forecasters were asked to predict state-of-the-art performance on June 30th of 2022, 2023, 2024, and 2025. While there were four benchmarks total, the two most notable were MATH (a dataset of free-response math contest problems) and …
GPT-4 surprised many people with its abilities at coding, creative brainstorming, letter-writing, and other skills. How can we be less surprised by developments in machine learning? In this post, I’ll forecast the properties of large pretrained ML systems in 2030.
I’ve previously argued that machine learning systems often exhibit emergent capabilities, and that these capabilities could lead to unintended negative consequences. But how can we reason concretely about these consequences?
Thanks to Collin Burns, Ruiqi Zhong, Cassidy Laidlaw, Jean-Stanislas Denain, and Erik Jones, who generated most of the considerations discussed in this post. Previously [https://bounded-regret.ghost.io/ai-forecasting-one-year-in/], I evaluated the accuracy of forecasts about performance on the MATH and MMLU (Massive Multitask) datasets. I argued that most people, …
Thanks to Hao Zhang, Kayvon Fatahalian, and Jean-Stanislas Denain for helpful discussions and comments. Addendum and erratum: see here [https://kipp.ly/blog/transformer-inference-arithmetic/] for an excellent discussion of similar ideas by Kipply Chen. In addition, James Bradbury has pointed out to me that some of the constants in this …
My students and collaborators have been doing some particularly awesome work over the past several months, and to highlight that I wanted to summarize their papers here and explain why I’m excited about them. There are six papers in three categories. Human-Aligned AI * The Effects of Reward Misspecification: Mapping …
In 1972, the Nobel prize-winning physicist Philip Anderson wrote the essay "More Is Different [https://science.sciencemag.org/content/177/4047/393]". In it, he argues that quantitative changes can lead to qualitatively different and unexpected phenomena. While he focused on physics, one can find many examples of More is …
Last August, my research group created a forecasting contest [https://bounded-regret.ghost.io/ai-forecasting/] to predict AI progress on four benchmarks. Forecasters were asked to predict state-of-the-art performance (SOTA) on each benchmark for June 30th of 2022, 2023, 2024, and 2025. It’s now past June 30th, so we can evaluate …