By default, powerful ML systems will have dangerous capabilities (such as hacking) and may not do what their operators want. Frontier AI labs should design and modify their systems to be less dangerous and more controllable. In particular, labs should:
Without strong security, sophisticated actors can steal AI model weights. Thieves are likely to deploy dangerous models incautiously; none of a lab's deployment-safety measures matter if another actor deploys the models without them.
Labs should:
What labs should do
Frontier AI labs should have a governance structure and processes that promote safety and help them make important decisions well, especially decisions about safety practices and about model training and deployment. This is largely an open problem that labs should work to solve. Possible desiderata include:
When a dangerous model is deployed, it poses misalignment and misuse risks. Even before dangerous models exist, deploying models on the path toward dangerous capabilities can accelerate and diffuse progress toward them.
Labs should do research to make AI systems safer, more interpretable, and more controllable, and they should publish that research.
Labs should make a plan for aligning the powerful systems they create, and they should publish it to elicit feedback, to inform others' plans and research (especially other labs and external alignment researchers who can support or complement their plan), and to help them notice and respond when their plan needs to change. They should omit dangerous details if those exist. As their understanding of AI risk and safety techniques improves, they should update the plan. Sharing also enables...