TLDR A hidden Google model named Kingfall appeared for a short time. Early users say it thinks deeply and feels like a huge upgrade. Its leak hints that Gemini 2.5 Pro is close. The roundup also covers an OpenAI movie, big bonuses to keep researchers, new ChatGPT features, and moves by Anthropic and the Pentagon. […]| NATURAL 20
TLDR Top language models were thrown into the board game Diplomacy and forced to negotiate, ally, and betray. OpenAI’s o3 won by secretly forming coalitions and then knifing its friends. Gemini 2.5 Pro fought well but fell to a coordinated backstab. Claude tried to stay honest and paid the price. The open-source benchmark reveals which […]| NATURAL 20
TLDR MIT researchers have developed “self-adapting language models” (SEAL) that can improve their own abilities by generating their own training data and updating their internal parameters. This allows models to better learn from new information, adapt to tasks on the fly, and move closer to becoming long-term autonomous AI agents. It’s a major step toward models […]| NATURAL 20
TLDR Apple released a study claiming that large language models only look like they can reason. The paper says they do fine on easy questions, do better when they “think” on medium ones, but fall apart on hard puzzles. A YouTuber walks through the study, shows its flaws, and argues the models simply skip impossible […]| NATURAL 20