YouTube link | AXRP - the AI X-risk Research Podcast
Using interpretations of SAE latents to do inference on a language model. | EleutherAI Blog
Building and evaluating an open-source pipeline for auto-interpretability | EleutherAI Blog
This post represents my personal hot takes, not the opinions of my team or employer. This is a massively updated version of a similar list I made two… | www.alignmentforum.org