A highly opinionated list of what mechanistic interpretability papers to read when getting into the field| Neel Nanda
YouTube link| AXRP - the AI X-risk Research Podcast
Using interpretations of SAE latents to do inference on a language model| EleutherAI Blog
Building and evaluating an open-source pipeline for auto-interpretability| EleutherAI Blog