Yoshua Bengio warns that the current approach to developing AI models carries potentially catastrophic risks.| TIME
When sensing defeat in a match against a skilled chess bot, advanced models sometimes hack their opponent, a study found.| TIME
LawZero is a nonprofit organization committed to advancing research and creating technical solutions that enable safe-by-design AI systems.| lawzero.org
The leading AI companies are increasingly focused on building generalist AI agents -- systems that can autonomously plan, act, and pursue goals across almost all tasks that humans can perform. Despite how useful these systems might be, unchecked AI agency poses significant risks to public safety and security, ranging from misuse by malicious actors to a potentially irreversible loss of human control. We discuss how these risks arise from current AI training methods. Indeed, various scenarios ...| arXiv.org
Frontier models are increasingly trained and deployed as autonomous agents. One safety concern is that AI agents might covertly pursue misaligned goals, hiding their true capabilities and objectives - also known as scheming. We study whether models have the capability to scheme in pursuit of a goal that we provide in-context and instruct the model to strongly follow. We evaluate frontier models on a suite of six agentic evaluations where models are instructed to pursue goals and are placed in ...| arXiv.org