Risks from Learned Optimization in Advanced ML Systems

Evan Hubinger, Chris van Merwijk, Vladimir Mikulik, Joar Skalse, and Scott Garrabrant

This paper is available on arXiv, the AI Alignment Forum, and LessWrong.

Abstract: We analyze the type of learned optimization that occurs when a learned model (such as a neural network) is itself an optimizer, a...

Machine Intelligence Research Institute