From: www.lesswrong.com
Backdoors as an analogy for deceptive alignment — LessWrong
https://www.lesswrong.com/posts/efwcZ35LwS6HgFcN8/backdoors-as-an-analogy-for-deceptive-alignment
ARC has released a paper on Backdoor defense, learnability and obfuscation, in which we study a formal notion of backdoors in ML models. Part of our m…