What to make of the statements of the AI labs?| www.oneusefulthing.org
A step change as influential as the release of GPT-4. Reasoning language models are the current big thing.| www.interconnects.ai
Learn more about the only AI benchmark that measures AGI progress.| ARC Prize
Clémentine Fourrier of HuggingFace on why you should stop using LLMs as Judges, what comes after MMLU, how prompt formatting sways benchmark results, and why leaderboards are GPU poor| www.latent.space
Scaling will run out. The question is when.| www.aisnakeoil.com
You can just draw more samples| redwoodresearch.substack.com