The Self-Hating Attention Head: A Deep Dive in GPT-2 — AI Alignment Forum
https://www.alignmentforum.org/posts/wxPvdBwWeaneAsWRB/the-self-hating-attention-head-a-deep-dive-in-gpt-2-1
gpt2-small's head L1H5 directs attention to semantically similar tokens and actively suppresses self-attention. The head computes attention…
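
One way to make "suppresses self-attention" measurable is to compare the weight each token places on itself with what a uniform causal pattern would give it. The sketch below is a hedged illustration only: `causal_softmax`, `self_attention_ratio`, and the toy score matrices are hypothetical names and made-up inputs, not taken from the post or from GPT-2's weights, and it does not try to reproduce how the head actually computes attention.

```python
import math


def causal_softmax(scores):
    """Row-wise softmax where query i may only see keys 0..i (causal mask)."""
    pattern = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]
        peak = max(visible)  # subtract the row max for numerical stability
        exps = [math.exp(s - peak) for s in visible]
        total = sum(exps)
        # Masked (future) positions receive exactly zero weight.
        pattern.append([e / total for e in exps] + [0.0] * (len(row) - i - 1))
    return pattern


def self_attention_ratio(pattern):
    """Average of pattern[i][i] / (1 / (i + 1)) over query positions i >= 1.

    Position 0 is skipped because it can only attend to itself.
    A value near 1 means self-attention is at the uniform baseline;
    a value well below 1 means the diagonal is being suppressed.
    """
    ratios = []
    for i in range(1, len(pattern)):
        uniform = 1.0 / (i + 1)
        ratios.append(pattern[i][i] / uniform)
    return sum(ratios) / len(ratios)


# Toy inputs (assumptions for illustration, not real GPT-2 attention scores).
n = 6
flat_scores = [[0.0] * n for _ in range(n)]
penalised_scores = [[-5.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

flat_ratio = self_attention_ratio(causal_softmax(flat_scores))
penalised_ratio = self_attention_ratio(causal_softmax(penalised_scores))
print(flat_ratio, penalised_ratio)
```

To apply the same measurement to the real head, one would pass in the attention pattern extracted for L1H5 from a gpt2-small forward pass instead of the toy matrices; how that pattern is extracted depends on the tooling used and is left out here.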