MONA: Myopic Optimization with Non-myopic Approval Can Mitigate Multi-step Reward Hacking
1 Introduction
arxiv.org