How learning from human feedback revolutionized generative language models...| cameronrwolfe.substack.com
Making alignment via RLHF more scalable by automating human feedback...| cameronrwolfe.substack.com