A small model, a reward model, and a search algorithm can beat an LLM that is 100+ times larger. | bdtechtalks.substack.com