This article explains key concepts that come up in the context of AI alignment. These terms are only attempts at gesturing at the underlying ideas, and the ideas are what is important. There is no strict consensus on which name should correspond to which idea, and different people use the terms differently.[[1]]| BlueDot Impact
This is an update on the work on AI Safety via Debate that we previously wrote about here. …| www.alignmentforum.org
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.| www.anthropic.com
A high-level view on how this new approach fits into our alignment plans| aligned.substack.com
Some arguments in favor and responses to common objections| aligned.substack.com