Referenced posts (www.alignmentforum.org):

- Human values are functions of latent variables in our minds, but those variables may not correspond to anything in the real world. How can an AI optimize for them?
- Nate Soares argues that one of the core problems with AI alignment is that an AI system's capabilities will likely generalize to new domains much faster than its alignment.
- AI researchers warn that advanced machine learning systems may develop their own internal goals that don't match what we intended, a problem known as "mesa-optimization".
- The second of five posts in the Risks from Learned Optimization sequence, based on the paper "Risks from Learned Optimization in Advanced Machine Learning Systems".