Week 4 of the AI alignment curriculum.

Scalable oversight refers to methods that enable humans to oversee AI systems solving tasks too complicated for a single human to evaluate. Basically divide and conquer.

AI alignment landscape (Christiano, 2020)
Intent alignment -> getting AIs to want to do what you want them to do.
Paul isn't focused on reliability (how often the AI makes mistakes); he hopes it'll get better along with capabilities.
The aim is an AI that is well-meaning!