Reasoning models were as big an improvement as the Transformer, at least on some benchmarks| epochai.substack.com
I gave the opening keynote at the AI Engineer World’s Fair yesterday. I was a late addition to the schedule: OpenAI pulled out of their slot at the last minute, …| Simon Willison’s Weblog
Google has released Gemini 2.0 Flash Thinking, a direct competitor to OpenAI's o1 and a breakthrough in AI models with transparent reasoning. Compare features, benchmarks, and limitations.| Helicone.ai
If there’s one AI tool that’s growing to be a dangerous competitor to OpenAI’s infamous ChatGPT, it’s Claude.| Keywords Everywhere Blog
Anthropic's MCP could do for agents what TCP/IP did for networking.| www.understandingai.org
She said 'unprecedented' so many times I almost lost count.| ZDNET
Scaling reinforcement learning, tracing circuits, and the path to fully autonomous agents| www.dwarkesh.com
The development of AI that is more broadly capable than humans will create a new and serious threat: *AI-enabled coups*. An AI-enabled coup could be staged by a very small group, or just a single person, and could occur even in established democracies. Sufficiently advanced AI will introduce three novel dynamics that significantly increase coup risk. Firstly, military and government leaders could fully replace human personnel with AI systems that are *singularly loyal* to them, eliminating th...| Forethought
Why under-elicitation and scheming are both important to address| aligned.substack.com
[Guest post by Mike Reese, Associate Dean of the Center for Teaching Excellence and Innovation & Associate Teaching Professor of Sociology, Johns Hopkins University]| The Innovative Instructor
Posted on Friday 19 Jul 2024. 1,601 words, 29 links. By Matt Webb.| Interconnected, a blog by Matt Webb
From GitHub Copilot to ChatGPT to Claude Artifacts, how Val Town borrowed the best of all the code generation tools| blog.val.town
Most coders want AI to write code faster: I want AI to write FASTER CODE.| minimaxir.com
A lot has happened in the world of Large Language Models over the course of 2024. Here’s a review of things we figured out about the field in the past …| Simon Willison’s Weblog
Our expectations for AI are often wrong, and that can be a problem.| miraculous cake
…and how to correct it| miraculous cake
Leading labs are reportedly struggling to improve LLM performance with scaling.| www.understandingai.org
How do VLMs combine their modalities?| seantrott.substack.com
The AI Safety Institute will have access to new models before and following their releases under the new testing and evaluation pacts.| FedScoop
With ChatGPT being used colloquially to represent nearly everything AI, lesser-known chat AI models like Claude are sometimes overlooked. Today, Anthropic has released Claude 3.5 Sonnet with some features you'll likely find not only impressive but incredibly useful.| New Atlas
Claude Pro and Team users can now organize chats into Projects. Projects bring together internal knowledge and chat activity in one place so Claude can be your go-to expert for generating ideas, making decisions, and moving work forward.| www.anthropic.com
Cursor is a code editor that helps you speed up your development process by using AI to write code for you.| Juan Stoppa
As we enter the Age of Accessible Law, a wave of new demand is coming our way — but AI will meet most of the surge. What will be left for lawyers? Just the most valuable and irreplaceable role in law.| jordanfurlong.substack.com
Like Claude Artifacts, but with a backend and database| blog.val.town
What’s the deal with the uncanny valley?| minimaxir.com
The latest attempt at an AI-powered wearable is an always-listening pendant. But it doesn’t help you be more productive, it just keeps you company.| WIRED
How do software engineers utilize GenAI tools in their software development workflow? We sidestep the hype, and look to the reality of tech professionals using LLMs for coding and other tasks.| newsletter.pragmaticengineer.com
Speculations on the role of RLHF and why I love the model for people who pay attention.| www.interconnects.ai
If you have a project, an idea, a product feature, or anything else that you want other people to understand and have conversations about... give them something to link to! …| simonwillison.net
Plus: Ilya Sutskever is back; Nvidia becomes the world's most valuable company; another company trials a humanoid robot; a military robot-dog arms race; a mad scientist grows neurons to play Doom| www.humanityredefined.com