Public discussions about catastrophic risks from general AI systems are often derailed by the word “intelligence”. People hold different definitions of intelligence, associate it with concepts like consciousness that are not relevant to AI risks, or dismiss the risks because intelligence is not well-defined. I would advocate for using the term “capabilities” […]| Victoria Krakovna
AI alignment work is usually considered “longtermist”, i.e. concerned with preserving humanity’s long-term potential. This was the primary motivation for the work when the alignment field got started around 20 years ago, when general AI seemed far away or impossible to most people in AI. However, given the current rate of progress towards advanced AI […]| Victoria Krakovna
(Coauthored with others on the alignment team and cross-posted from the alignment forum: part 1, part 2) A sharp left turn (SLT) is a possible rapid increase in AI system capabilities (such as planning and world modeling) that could result in alignment methods no longer working. This post aims to make the sharp left turn scenario […]| Victoria Krakovna
(This post is based on an overview talk I gave at UCL EA and Oxford AI society (recording here). Cross-posted to the Alignment Forum. Thanks to Janos Kramar for detailed feedback on this post and t…| Victoria Krakovna
[PROLOGUE – EVERYBODY WANTS A ROCK][PART I – THERMOSTAT][PART II – MOTIVATION][PART III – PERSONALITY AND INDIVIDUAL DIFFERENCES][PART IV – LEARNING][PART V – DEPRESSION AND OTHER DIAGNOSES][PART V…| SLIME MOLD TIME MOLD
How can we design an AI that will be highly capable and will not harm humans? In my opinion, we need to figure out this question, that of controlling AI so that it behaves safely, before we reach human-level AI, aka AGI; and to be successful, we need all hands on deck.| Yoshua Bengio
Last year, a major focus of my research was developing a better understanding of threat models for AI risk. This post looks back at some posts on threat models I (co)wrote in 2022 (based on my…| Victoria Krakovna
This post discusses how rogue AIs could potentially arise, in order to stimulate thinking and investment in both technical research and societal reforms aimed at minimizing such catastrophic outcomes.| Yoshua Bengio