A look inside Google’s Gemini 2.5 Deep Think, the AI that uses extended "slow thinking" to solve complex math and code problems. | bdtechtalks.substack.com
Generative reward modeling uses principles and critiques to help LLMs learn to reason about tasks without explicit ground-truth signals. | bdtechtalks.substack.com