October 2025| Simon Willison’s Weblog
I feel like vibe coding is pretty well established now as covering the fast, loose and irresponsible way of building software with AI—entirely prompt-driven, and with no attention paid to …| Simon Willison’s Weblog
The term context engineering has recently started to gain traction as a better alternative to prompt engineering. I like it. I think this one may have sticking power. Here's an …| Simon Willison’s Weblog
Meta’s new Llama 3.3 70B is a genuinely GPT-4 class Large Language Model that runs on my laptop. Just 20 months ago I was amazed to see something that felt …| Simon Willison’s Weblog
The SIFT method is "an evaluation strategy developed by digital literacy expert, Mike Caulfield, to help determine whether online content can be trusted for credible or reliable sources of information." …| Simon Willison’s Weblog
When I wrote about how good ChatGPT with GPT-5 is at search yesterday I nearly added a note about how comparatively disappointing Google's efforts around this are. I'm glad I …| Simon Willison’s Weblog
Sunday, 7th September 2025| Simon Willison’s Weblog
Well, the types of computers we have today are tools. They’re responders: you ask a computer to do something and it will do it. The next stage is going to …| Simon Willison’s Weblog
I’ve noticed something interesting over the past few weeks: I’ve started using the term “agent” in conversations where I don’t feel the need to then define it, roll my eyes …| Simon Willison’s Weblog
More from Mike Caulfield (see also the SIFT method). He starts with a fantastic example of Google's AI mode usually correctly handling a common piece of misinformation but occasionally falling …| Simon Willison’s Weblog
I’m experimenting with using MySQL full text indexing to generate a list of “related entries” for each entry (click on an item’s permalink to see it in action). It works …| Simon Willison’s Weblog
September 2025| Simon Willison’s Weblog
Apollo Global Management’s “Chief Economist” Dr. Torsten Sløk released this interesting chart which appears to show a slowdown in AI adoption rates among large (>250 employees) companies: Here’s the full …| Simon Willison’s Weblog
36 posts tagged ‘exfiltration-attacks’. Exfiltration attacks are prompt injection attacks against chatbots that have access to private information, where that information is exfiltrated by the attack…| Simon Willison’s Weblog
Remote Prompt Injection in GitLab Duo Leads to Source Code Theft.| Simon Willison’s Weblog
System Card: Claude Opus 4 & Claude Sonnet 4.| Simon Willison’s Weblog
Expanding on what we missed with sycophancy.| Simon Willison’s Weblog
Series: LLMs on personal devices| Simon Willison’s Weblog
12 posts tagged ‘ai-energy-usage’. How much energy is used by AI systems?| Simon Willison’s Weblog
I’m beginning to suspect that one of the most common misconceptions about LLMs such as ChatGPT involves how “training” works. A common complaint I see about these tools is that …| Simon Willison’s Weblog
ChatGPT now has "memory", and it's implemented in a delightfully simple way. You can instruct it to remember specific things about you and it will then have access to that …| Simon Willison’s Weblog
Earlier this month I wrote about how ChatGPT can’t access the internet, even though it really looks like it can. Consider this part two in the series. Here’s another common …| Simon Willison’s Weblog
I shipped LLM 0.27 today (followed by a 0.27.1 with minor bug fixes), adding support for the new GPT-5 family of models from OpenAI plus a flurry of improvements to …| Simon Willison’s Weblog
December 2023| Simon Willison’s Weblog
I gave a talk on Wednesday at the Bay Area AI Security Meetup about prompt injection, the lethal trifecta and the challenges of securing systems that use MCP. It wasn’t …| Simon Willison’s Weblog
August 2025| Simon Willison’s Weblog
I’ve been dipping into the r/ChatGPT subreddit recently to see how people are reacting to the GPT-5 launch, and so far the vibes there are not good. This AMA thread …| Simon Willison’s Weblog
I’ve had preview access to the new GPT-5 model family for the past two weeks (see related video and my disclosures) and have been using GPT-5 as my daily-driver. It’s …| Simon Willison’s Weblog
This week, ChatGPT is on track to reach 700M weekly active users — up from 500M at the end of March and 4× since last year.| Simon Willison’s Weblog
Recent| Simon Willison’s Weblog
Sycophancy in GPT-4o: What happened and what we’re doing about it| Simon Willison’s Weblog
Dropping a model release as significant as Llama 4 on a weekend is plain unfair! So far the best place to learn about the new model family is this post …| Simon Willison’s Weblog
I gave the opening keynote at the AI Engineer World’s Fair yesterday. I was a late addition to the schedule: OpenAI pulled out of their slot at the last minute, …| Simon Willison’s Weblog
October 2020| Simon Willison’s Weblog
I wrote about the new GLM-4.5 model family yesterday—new open weight (MIT licensed) models from Z.ai in China which their benchmarks claim score highly in coding even against models such …| Simon Willison’s Weblog
For the past two and a half years the feature I’ve most wanted from LLMs is the ability to take on search-based research tasks on my behalf. We saw the …| Simon Willison’s Weblog
31 posts tagged ‘vibe-coding’. As defined here - not the same thing as AI-assisted programming, though there's some overlap.| Simon Willison’s Weblog
In further evidence that phishing attacks can catch out the most sophisticated among us, security researcher (and operator of ';--have i been pwned?) Troy Hunt reports on how he fell …| Simon Willison’s Weblog
April 2025| Simon Willison’s Weblog
If you ask the new Grok 4 for opinions on controversial questions, it will sometimes run a search to find out Elon Musk’s stance before providing you with an answer. …| Simon Willison’s Weblog
Released last night, Grok 4 is now available via both API and a paid subscription for end-users. Update: If you ask it about controversial topics it will sometimes search X …| Simon Willison’s Weblog
Max Woolf pointed out this new feature of the Gemini 2.5 series (here’s my coverage of 2.5 Pro and 2.5 Flash) in a comment on Hacker News: One hidden note …| Simon Willison’s Weblog
Quitting programming as a career right now because of LLMs would be like quitting carpentry as a career thanks to the invention of the table saw.| Simon Willison’s Weblog
Here's yet another example of a lethal trifecta attack, where an LLM system combines access to private data, exposure to potentially malicious instructions and a mechanism to communicate data back …| Simon Willison’s Weblog
If you are a user of LLM systems that use tools (you can call them “AI agents” if you like) it is critically important that you understand the risk of …| Simon Willison’s Weblog
120 posts tagged ‘prompt-injection’. Prompt Injection is a security attack against applications built on top of Large Language Models, introduced here and further described in this series of posts.| Simon Willison’s Weblog
This new paper by 11 authors from organizations including IBM, Invariant Labs, ETH Zurich, Google and Microsoft is an excellent addition to the literature on prompt injection and LLM security. …| Simon Willison’s Weblog
I presented an invited keynote at the AI Engineer World’s Fair in San Francisco this week. This is my third time speaking at the event—here are my talks from October …| Simon Willison’s Weblog
Solomon Hykes just presented the best definition of an AI agent I've seen yet, on stage at the AI Engineer World's Fair: An AI agent is an LLM wrecking its …| Simon Willison’s Weblog
A fun new benchmark just dropped! Inspired by the Claude 4 system card—which showed that Claude 4 might just rat you out to the authorities if you told it to …| Simon Willison’s Weblog
LLM 0.26 is out with the biggest new feature since I started the project: support for tools. You can now use the LLM CLI tool—and Python library—to grant LLMs from …| Simon Willison’s Weblog
Big upgrade to Mistral's API this morning: they've announced a new "Agents API". Mistral have been using the term "agents" for a while now. Here's how they describe them: AI …| Simon Willison’s Weblog
I was going slightly spare at the fact that every talk at this Anthropic developer conference has used the word "agents" dozens of times, but nobody ever stopped to provide …| Simon Willison’s Weblog
Classic slop: it listed real authors with entirely fake books. There's an important follow-up from 404 Media in their subsequent story: Victor Lim, the vice president of marketing and communications …| Simon Willison’s Weblog
Yet another example of the classic Markdown image exfiltration attack, this time affecting GitLab Duo - GitLab's chatbot. Omer Mayraz reports on how they found and disclosed the issue. The …| Simon Willison’s Weblog
As more people start hacking around with implementations of MCP (the Model Context Protocol, a new standard for making tools available to LLM-powered systems) the security implications of tools built …| Simon Willison’s Weblog
GitHub's official MCP server grants LLMs a whole host of new abilities, including being able to read issues in repositories the user has access to and submit new pull …| Simon Willison’s Weblog
Anthropic publish most of the system prompts for their chat models as part of their release notes. They recently shared the new prompts for both Claude Opus 4 and Claude …| Simon Willison’s Weblog
GitHub issues is almost the best notebook in the world. Free and unlimited, for both public and private notes. Comprehensive Markdown support, including syntax highlighting for almost any language. Plus …| Simon Willison’s Weblog
Direct link to a PDF on Anthropic's CDN because they don't appear to have a landing page anywhere for this document. Anthropic's system cards are always worth a look, and …| Simon Willison’s Weblog
Last month ChatGPT got a major upgrade. As far as I can tell the closest to an official announcement was this tweet from @OpenAI: Starting today [April 10th 2025], memory …| Simon Willison’s Weblog
GPT-4o's recent update caused it to be way too sycophantic and disingenuously praise anything the user said. OpenAI's Aidan McLaughlin: last night we rolled out our first fix to remedy …| Simon Willison’s Weblog
Relatively thin post from OpenAI talking about their recent rollback of the GPT-4o model that made the model way too sycophantic - "overly flattering or agreeable", to use OpenAI's own …| Simon Willison’s Weblog
The Chatbot Arena has become the go-to place for vibes-based evaluation of LLMs over the past two years. The project, originating at UC Berkeley, is home to a large community …| Simon Willison’s Weblog
Watching OpenAI’s new o3 model guess where a photo was taken is one of those moments where decades of science fiction suddenly come to life. It’s a cross between the …| Simon Willison’s Weblog
In the two and a half years that we’ve been talking about prompt injection attacks I’ve seen alarmingly little progress towards a robust solution. The new paper Defeating Prompt Injections …| Simon Willison’s Weblog
Vibe coding is having a moment. The term was coined by Andrej Karpathy just a few weeks ago (on February 6th) and has since been featured in the New York …| Simon Willison’s Weblog
Apple told John Gruber (and other Apple press) this about the new "personalized" Siri: It’s going to take us longer than we thought to deliver on these features and we …| Simon Willison’s Weblog
Online discussions about using Large Language Models to help write code inevitably produce comments from developers whose experiences have been disappointing. They often ask what they’re doing wrong—how come some …| Simon Willison’s Weblog
I’ve added a powerful new capability to my shot-scraper command line browser automation tool: you can now use it to load a web page in a headless browser, execute JavaScript …| Simon Willison’s Weblog
A surprisingly common complaint I see from developers who have tried using LLMs for code is that they encountered a hallucination—usually the LLM inventing a method or even a full …| Simon Willison’s Weblog
There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because …| Simon Willison’s Weblog
LLM 0.23 is out today, and the signature feature is support for schemas—a new way of providing structured output from a model that matches a specification provided by the user. …| Simon Willison’s Weblog
I’ve been making a lot of progress on Datasette Cloud this week. As an application that provides private hosted Datasette instances (initially targeted at data journalists and newsrooms) the majority …| Simon Willison’s Weblog
llm-mlx is a brand new plugin for my LLM Python Library and CLI utility which builds on top of Apple’s excellent MLX array framework library and mlx-lm package. If you’re …| Simon Willison’s Weblog
One of the most common proposed solutions to prompt injection attacks (where an AI language model backed system is subverted by a user injecting malicious input—“ignore previous instructions and do …| Simon Willison’s Weblog
I’ve started tracking TILs—Today I Learneds—inspired by this five-year-and-counting collection by Josh Branchaud on GitHub (found via Hacker News). I’m keeping mine in GitHub too, and using GitHub Actions to …| Simon Willison’s Weblog
This legendary page from an internal IBM training in 1979 could not be more appropriate for our new age of AI. A computer can never be held accountable Therefore a …| Simon Willison’s Weblog
I’m the guest for the most recent episode of the Real Python podcast with Christopher Bailey, talking about Using LLMs for Python Development. We covered a lot of other topics …| Simon Willison’s Weblog
I saw this tweet yesterday from @deepfates, and I am very on board with this: Watching in real time as “slop” becomes a term of art. the way that “spam” …| Simon Willison’s Weblog
The Oxide and Friends podcast has an annual tradition of asking guests to share their predictions for the next 1, 3 and 6 years. Here’s 2022, 2023 and 2024. This …| Simon Willison’s Weblog
I started running a basic link blog on this domain back in November 2003—publishing links (which I called “blogmarks”) with a title, URL, short snippet of commentary and a “via” …| simonwillison.net
One of my weirder hobbies is trying to convince people that the idea that companies are listening to you through your phone’s microphone and serving you targeted ads is a …| Simon Willison’s Weblog
A lot has happened in the world of Large Language Models over the course of 2024. Here’s a review of things we figured out about the field in the past …| Simon Willison’s Weblog
I’ve written a lot about how I’ve been using Claude to build one-shot HTML+JavaScript applications via Claude Artifacts. I recently started using a similar pattern to create one-shot Python utilities, …| Simon Willison’s Weblog
Amazon released three new Large Language Models yesterday at their AWS re:Invent conference. The new model family is called Amazon Nova and comes in three sizes: Micro, Lite and Pro. …| Simon Willison’s Weblog
Series: Prompt injection| Simon Willison’s Weblog
There’s a whole lot of buzz around the new Qwen2.5-Coder Series of open source (Apache 2.0 licensed) LLM releases from Alibaba’s Qwen research team. On first impression it looks like …| Simon Willison’s Weblog
Audio Overview is a fun new feature of Google’s NotebookLM which is getting a lot of attention right now. It generates a one-off custom podcast against content you provide, where …| simonwillison.net
Whether or not you enjoy MrBeast’s format of YouTube videos (here’s [a 2022 Rolling Stone profile](https://www.rollingstone.com/culture/culture-features/mrbeast-youtube-cover-story-interview-1334604/) if you’re unfamiliar), this leaked onboarding document for new members of his production company …| simonwillison.net
OpenAI released two major new preview models today: o1-preview and o1-mini (that mini one is not a preview)—previously rumored as having the codename “strawberry”. There’s a lot to understand about …| simonwillison.net
Yesterday I finally developed something I’ve been casually thinking about building for a long time: django-http-debug. It’s a reusable Django app—something you can pip install into any Django project—which provides …| simonwillison.net
If you have a project, an idea, a product feature, or anything else that you want other people to understand and have conversations about... give them something to link to! …| simonwillison.net
I keep seeing people use the term “prompt injection” when they’re actually talking about “jailbreaking”. This mistake is so common now that I’m not sure it’s possible to correct course: …| Simon Willison’s Weblog
I attended the Story Discovery At Scale data journalism conference at Stanford this week. One of the perennial hot topics at any journalism conference concerns data extraction: how can we …| Simon Willison’s Weblog
Here is a short, illustrative example of one of the ways in which I use Claude and ChatGPT on a daily basis. I recently learned that the Adirondack Park is …| Simon Willison’s Weblog
Last week Google introduced Gemini Pro 1.5, an enormous upgrade to their Gemini series of AI models. Gemini Pro 1.5 has a 1,000,000 token context size. This is huge—previously that …| Simon Willison’s Weblog
Git scraping is the name I’ve given a scraping technique that I’ve been experimenting with for a few years now. It’s really effective, and more people should use it. Update …| Simon Willison’s Weblog
I really want an AI assistant: a Large Language Model powered chatbot that can answer questions and perform actions for me based on access to my private data and tools. …| Simon Willison’s Weblog
Today I’m releasing datasette-enrichments, a new feature for Datasette which provides a framework for applying “enrichments” that can augment your data. An enrichment is code that can be run against …| Simon Willison’s Weblog