Investigating the seahorse emoji doom loop using logitlens.| vgel.me
A write-up of work extending and building on the paper Emergent World Representations| Neel Nanda
Introduction| Neel Nanda
Building and evaluating an open-source pipeline for auto-interpretability| EleutherAI Blog
What we've been up to for the past year at EleutherAI.| EleutherAI Blog