AI progress may lead to transformative AI systems in the next decade, but we do not yet understand how to make such systems safe and aligned with human values. In response, we are pursuing a variety of research directions aimed at better understanding, evaluating, and aligning AI systems.| www.anthropic.com
By Pedro A. Ortega, Vishal Maini, and the DeepMind safety team| Medium
TL;DR: We give a threat model literature review, propose a categorization and describe a consensus threat model from some of DeepMind's AGI safety te…| www.lesswrong.com