From:
Evan Miller’s News
(Uncensored)
Attention Is Off By One
https://www.evanmiller.org/attention-is-off-by-one.html
The Transformer has a mathematical bug that has been overlooked for 6+ years. I propose fixing its outliers with two new devices, Softmax One and QuietAttention: Attention Is Off By One
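For readers who want a concrete picture of Softmax One, here is a minimal sketch. It assumes the definition from the linked post, where the usual softmax denominator gains an extra 1: softmax1(x)_i = exp(x_i) / (1 + sum_j exp(x_j)). The function names and the max-shift stabilization are illustrative choices, not code from the post.

```python
import math

def softmax(xs):
    """Standard softmax: the outputs always sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_one(xs):
    """Sketch of Softmax One: exp(x_i) / (1 + sum_j exp(x_j)).

    Shifting by m = max(max(xs), 0) keeps every exponent <= 0, and the
    extra 1 in the denominator becomes exp(-m) after the shift.
    """
    m = max(max(xs), 0.0)
    exps = [math.exp(x - m) for x in xs]
    total = math.exp(-m) + sum(exps)
    return [e / total for e in exps]

scores = [-10.0, -10.0, -10.0]
print(sum(softmax(scores)))      # forced to sum to 1
print(sum(softmax_one(scores)))  # can fall toward 0 when all scores are very negative
```

The difference shows up when every score is strongly negative: the standard softmax still has to distribute a total weight of 1, while this variant lets the total weight shrink toward zero.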