Double descent is a puzzling phenomenon in machine learning where increasing model size/training time/data can initially hurt performance, but then i…| www.alignmentforum.org
ARC explores the challenge of extracting information from AI systems that isn't directly observable in their outputs, i.e "eliciting latent knowledge…| www.alignmentforum.org
This is a sequence version of the paper “Risks from Learned Optimization in Advanced Machine Learning Systems” by Evan Hubinger, Chris van Merwijk, Vladimir Mikulik, Joar Skalse, and Scott Garrabrant. Each post in the sequence corresponds to a different section of the paper. Evan Hubinger, Chris van Merwijk, Vladimir Mikulik, and Joar Skalse contributed equally to this sequence. The goal of this sequence is to analyze the type of learned optimization that occurs when a learned model (such...| www.alignmentforum.org