From: www.lesswrong.com
Gradient hacking — LessWrong
https://www.lesswrong.com/posts/uXH4r6MmKPedk8rMA/gradient-hacking
Gradient hacking is when a deceptively aligned AI deliberately acts to influence how the training process updates it. For example, it might try to be…