When using a fuzzy, intuitive approach, it’s easy to gloss-over issues by imagining that a corrigible AGI will behave like a helpful, human servant. By using a sharper, more mathematical frame, we can more precisely investigate where corrigibility may have problems, such as by testing whether a purely corrigible agent behaves nicely in toy-settings.