Do we need RL for RLHF? - by Nathan Lambert - Interconnects
https://www.interconnects.ai/p/the-dpo-debate
Direct (DPO) vs. RL methods for preferences, more RLHF models, and hard truths in open RLHF work. We have more questions than answers.