Writing about AI, geo, culture, media, data, and the ways they interact.| Drew Breunig
How do you add advertising to a tool used to make decisions?| Drew Breunig
How does the probabilistic nature of AI change the nature of computer engineering?| Drew Breunig
AI job titles are a confusing mess of mix-and-match terms. This decoder ring breaks down the patterns behind titles.| Drew Breunig
AI-assisted engineers can code 2-5x faster, but product management work hasn't accelerated at the same pace. Rather than wait for traditional PM processes, organizations are empowering hybrid engineer-PMs, FDEs, to build directly with customers.| Drew Breunig
OpenAI released its open-weight model, gpt-oss, today. It comes in two sizes, 120B and 20B, the latter of which runs briskly on my Mac Studio. I’m sure I’ll have more impressions as I use it in anger over the next few weeks, but here are my initial thoughts:| Drew Breunig
To improve AI’s qualitative skills, we’ll build opinionated models. Aesthetic choices, not fuzzy averages, will be chosen and optimized for. FLUX.1-Krea is the first of many.| Drew Breunig
Our last post on Kimi K2 dives into how the Moonshot team used reinforcement learning (RL) on qualitative tasks. If you haven’t already, check out the last two explorations:| Drew Breunig
The Moonshot AI team synthesized thousands of tools, agents, users, and sessions to build a library of training data.| Drew Breunig
Two weeks ago, Beijing-based Moonshot AI launched Kimi K2, an open source model that rivaled the coding capabilities of larger, closed models. It’s a really impressive model (though its coding capabilities have since been overshadowed by Qwen 3 Coder), especially since it’s cheaper to run than Claude 3.5 Haiku.| Drew Breunig
Kenneth Cukier and Eleanor Warnock recently launched a newsletter about AI’s effect on writing and writers, Chief Word Officer. Their recent issue, an interview with comedy writer Madeleine Brettingham, is fascinating; read the whole thing.| Drew Breunig
Context engineering isn’t just another buzzword, but the emergence of a new field, community, and culture. One that will grow dramatically over the coming years. Collective realizations like these are rare in our careers. And we get to help shape this one, together.| Drew Breunig
Recently, “the bitter lesson” is having a moment. Coined in an essay by Rich Sutton, the bitter lesson is that, “general methods that leverage computation are ultimately the most effective, and by a large margin.” Why is the lesson bitter? Sutton writes:| Drew Breunig
Forget the benchmarks – the best way to track AI's capabilities is to watch which decisions experts delegate to AI.| Drew Breunig
The Red Queen Scenarios in Sales & HR Are Fueled by AI Apps| Drew Breunig
Adding unrelated text to questions can cause LLMs to consistently stumble.| Drew Breunig
6 tactics for fixing your context and shipping better agents. As Karpathy says, building LLM-powered apps means learning to ‘pack the context windows just right’—smartly deploying tools, managing information, and maintaining context hygiene.| Drew Breunig
Prompts are disposable requests written for chatbots. Contexts are evolving instructions curated for use in applications.| Drew Breunig
Taking care of your context is the key to building successful agents. Just because there's a 1 million token context window doesn't mean you should fill it.| Drew Breunig
Google's warts-and-all details about Gemini playing Pokémon demonstrate the messy reality of building effective agents.| Drew Breunig
Notes from a talk I delivered at the 2025 Data + AI Summit, detailing the problem with prompts in your code and how DSPy can make everything better.| Drew Breunig
Reviewing the few changes in the Claude 4.0 system prompts, we get a sense for how system prompts program chatbot applications.| Drew Breunig
Understand your Subsumption Window: the time between a product’s launch and the moment when a future model can replicate the product’s core functionality, out of the box.| Drew Breunig
Yesterday, after playing with some smaller models, I started to experiment with the idea of a flowchart for determining a model’s ancestry with a few prompts. For example, could you ask it about state-censored topics and about its development and figure out what model it was trained by or from? Luckily I aborted that effort, because Sam Paech, who maintains EQ-Bench, has built an entire “slop forensics” pipeline.| Drew Breunig
Your gender, ethnicity, and fandom can invisibly influence your chatbot interactions.| Drew Breunig
A couple days ago, Ásgeir Thor Johnson convinced Claude to give up its system prompt. The prompt is a good reminder that chatbots are more than just their model. They’re tools and instructions that accrue and are honed, through user feedback and design.| Drew Breunig
Two papers, taken together, point towards an AI future powered by networks of fast, cheap, diverse, local intelligences.| Drew Breunig
Increasingly, domain experts matter more when building great AI apps.| Drew Breunig
Model Context Protocols are suddenly everywhere. But what are they, in simple terms?| Drew Breunig
Open, small LLMs have gotten good enough that non-technical people can benefit from having a local chatbot on their machine.| Drew Breunig
A well-built custom eval lets you quickly test the newest models, iterate faster when developing prompts and pipelines, and ensure you’re always moving forward against your product’s specific goal. Let’s build an example eval – made from Jeopardy questions – to illustrate the value of a custom eval.| Drew Breunig
Yesterday on the Datasette Discord, Simon teased a new version of llm with multimodal capabilities. With one tiny command line tool, you can throw images at GPT-4o, Llama, Claude, and Gemini and ask for interpretations or details.| Drew Breunig
Overture’s Global Entity Reference System (GERS) could revolutionize geospatial data by standardizing ‘place’, making geospatial intelligence more accessible through simple data joins rather than complex mapping tools.| Drew Breunig
The hype is so loud we can’t appreciate the magic| Drew Breunig
An Empty Textbox Limits the Addressable Market| Drew Breunig
The early-adopter audience for chatbots is large and embraced LLMs quickly. But to grow beyond this market, AI companies must develop UX more approachable than an empty text field.| Drew Breunig
Let’s overcome decision fatigue by building a decision tree app from thousands of images of bathroom fixtures, an off-the-shelf image embedding model, and a few command-line tools.| Drew Breunig
What should we do if LLMs aren’t compatible with privacy legislation?| Drew Breunig