From:
www.lesswrong.com
Gradient hacking — LessWrong
https://www.lesswrong.com/posts/uXH4r6MmKPedk8rMA/gradient-hacking
Gradient hacking is when a deceptively aligned AI deliberately acts to influence how the training process updates it. For example, it might try to be…