The Python Software Foundation was recently "recommended for funding" (NSF terminology) for a $1.5m grant from the US government National Science Foundation to help improve the security of the Python …| Simon Willison’s Weblog
DeepSeek released a new model yesterday: DeepSeek-OCR, a 6.6GB model fine-tuned specifically for OCR. They released it as model weights that run using PyTorch and CUDA. I got it running …| Simon Willison’s Weblog
Anthropic this morning introduced Claude Skills, a new pattern for making new abilities available to their models: Claude can now use Skills to improve how it performs specific tasks. Skills …| Simon Willison’s Weblog
NVIDIA sent me a preview unit of their new DGX Spark desktop “AI supercomputer”. I’ve never had hardware to review before! You can consider this my first ever sponsored post …| Simon Willison’s Weblog
April 2025| Simon Willison’s Weblog
67 posts tagged ‘pelican-riding-a-bicycle’. My benchmark for LLMs: "Generate an SVG of a pelican riding a bicycle". "User might be a kid playing with words" according to Qwen3-4B-Thinking.| Simon Willison’s Weblog
On the morning of July 8, 2025, we observed undesired responses and immediately began investigating. To identify the specific language in the instructions causing the undesired behavior, we conducted multiple …| Simon Willison’s Weblog
Grok 4.| Simon Willison’s Weblog
July 2025| Simon Willison’s Weblog
For a while now I’ve been hearing from engineers who run multiple coding agents at once—firing up several Claude Code or Codex CLI instances at the same time, sometimes in …| Simon Willison’s Weblog
October 2025| Simon Willison’s Weblog
I feel like vibe coding is pretty well established now as covering the fast, loose and irresponsible way of building software with AI—entirely prompt-driven, and with no attention paid to …| Simon Willison’s Weblog
The term context engineering has recently started to gain traction as a better alternative to prompt engineering. I like it. I think this one may have sticking power. Here's an …| Simon Willison’s Weblog
Meta’s new Llama 3.3 70B is a genuinely GPT-4 class Large Language Model that runs on my laptop. Just 20 months ago I was amazed to see something that felt …| Simon Willison’s Weblog
Well, the types of computers we have today are tools. They’re responders: you ask a computer to do something and it will do it. The next stage is going to …| Simon Willison’s Weblog
I’ve noticed something interesting over the past few weeks: I’ve started using the term “agent” in conversations where I don’t feel the need to then define it, roll my eyes …| Simon Willison’s Weblog
More from Mike Caulfield (see also the SIFT method). He starts with a fantastic example of Google's AI mode usually correctly handling a common piece of misinformation but occasionally falling …| Simon Willison’s Weblog
Apollo Global Management’s “Chief Economist” Dr. Torsten Sløk released this interesting chart which appears to show a slowdown in AI adoption rates among large (>250 employees) companies: Here’s the full …| Simon Willison’s Weblog
36 posts tagged ‘exfiltration-attacks’. Exfiltration attacks are prompt injection attacks against chatbots that have access to private information, where that information is exfiltrated by the attack…| Simon Willison’s Weblog
Remote Prompt Injection in GitLab Duo Leads to Source Code Theft.| Simon Willison’s Weblog
System Card: Claude Opus 4 & Claude Sonnet 4.| Simon Willison’s Weblog
Expanding on what we missed with sycophancy.| Simon Willison’s Weblog
Series: LLMs on personal devices| Simon Willison’s Weblog
12 posts tagged ‘ai-energy-usage’. How much energy is used by AI systems?| Simon Willison’s Weblog
I’m beginning to suspect that one of the most common misconceptions about LLMs such as ChatGPT involves how “training” works. A common complaint I see about these tools is that …| Simon Willison’s Weblog
ChatGPT now has "memory", and it's implemented in a delightfully simple way. You can instruct it to remember specific things about you and it will then have access to that …| Simon Willison’s Weblog
Earlier this month I wrote about how ChatGPT can’t access the internet, even though it really looks like it can. Consider this part two in the series. Here’s another common …| Simon Willison’s Weblog
I shipped LLM 0.27 today (followed by a 0.27.1 with minor bug fixes), adding support for the new GPT-5 family of models from OpenAI plus a flurry of improvements to …| Simon Willison’s Weblog
December 2023| Simon Willison’s Weblog
I gave a talk on Wednesday at the Bay Area AI Security Meetup about prompt injection, the lethal trifecta and the challenges of securing systems that use MCP. It wasn’t …| Simon Willison’s Weblog
August 2025| Simon Willison’s Weblog
I’ve been dipping into the r/ChatGPT subreddit recently to see how people are reacting to the GPT-5 launch, and so far the vibes there are not good. This AMA thread …| Simon Willison’s Weblog
I’ve had preview access to the new GPT-5 model family for the past two weeks (see related video and my disclosures) and have been using GPT-5 as my daily-driver. It’s …| Simon Willison’s Weblog
This week, ChatGPT is on track to reach 700M weekly active users — up from 500M at the end of March and 4× since last year.| Simon Willison’s Weblog
Recent| Simon Willison’s Weblog
Sycophancy in GPT-4o: What happened and what we’re doing about it| Simon Willison’s Weblog
Dropping a model release as significant as Llama 4 on a weekend is plain unfair! So far the best place to learn about the new model family is this post …| Simon Willison’s Weblog
I gave the opening keynote at the AI Engineer World’s Fair yesterday. I was a late addition to the schedule: OpenAI pulled out of their slot at the last minute, …| Simon Willison’s Weblog
I wrote about the new GLM-4.5 model family yesterday—new open weight (MIT licensed) models from Z.ai in China which their benchmarks claim score highly in coding even against models such …| Simon Willison’s Weblog
Released last night, Grok 4 is now available via both API and a paid subscription for end-users. Update: If you ask it about controversial topics it will sometimes search X …| Simon Willison’s Weblog
Quitting programming as a career right now because of LLMs would be like quitting carpentry as a career thanks to the invention of the table saw.| Simon Willison’s Weblog
Here's yet another example of a lethal trifecta attack, where an LLM system combines access to private data, exposure to potentially malicious instructions and a mechanism to communicate data back …| Simon Willison’s Weblog
If you are a user of LLM systems that use tools (you can call them “AI agents” if you like) it is critically important that you understand the risk of …| Simon Willison’s Weblog
This new paper by 11 authors from organizations including IBM, Invariant Labs, ETH Zurich, Google and Microsoft is an excellent addition to the literature on prompt injection and LLM security. …| Simon Willison’s Weblog
I presented an invited keynote at the AI Engineer World’s Fair in San Francisco this week. This is my third time speaking at the event—here are my talks from October …| Simon Willison’s Weblog
Solomon Hykes just presented the best definition of an AI agent I've seen yet, on stage at the AI Engineer World's Fair: An AI agent is an LLM wrecking its …| Simon Willison’s Weblog
A fun new benchmark just dropped! Inspired by the Claude 4 system card—which showed that Claude 4 might just rat you out to the authorities if you told it to …| Simon Willison’s Weblog
LLM 0.26 is out with the biggest new feature since I started the project: support for tools. You can now use the LLM CLI tool—and Python library—to grant LLMs from …| Simon Willison’s Weblog
Big upgrade to Mistral's API this morning: they've announced a new "Agents API". Mistral have been using the term "agents" for a while now. Here's how they describe them: AI …| Simon Willison’s Weblog
I was going slightly spare at the fact that every talk at this Anthropic developer conference has used the word "agents" dozens of times, but nobody ever stopped to provide …| Simon Willison’s Weblog
Classic slop: it listed real authors with entirely fake books. There's an important follow-up from 404 Media in their subsequent story: Victor Lim, the vice president of marketing and communications …| Simon Willison’s Weblog
Yet another example of the classic Markdown image exfiltration attack, this time affecting GitLab Duo - GitLab's chatbot. Omer Mayraz reports on how they found and disclosed the issue. The …| Simon Willison’s Weblog
As more people start hacking around with implementations of MCP (the Model Context Protocol, a new standard for making tools available to LLM-powered systems) the security implications of tools built …| Simon Willison’s Weblog
GitHub's official MCP server grants LLMs a whole host of new abilities, including being able to read and issues in repositories the user has access to and submit new pull …| Simon Willison’s Weblog
Anthropic publish most of the system prompts for their chat models as part of their release notes. They recently shared the new prompts for both Claude Opus 4 and Claude …| Simon Willison’s Weblog
GitHub issues is almost the best notebook in the world. Free and unlimited, for both public and private notes. Comprehensive Markdown support, including syntax highlighting for almost any language. Plus …| Simon Willison’s Weblog
Direct link to a PDF on Anthropic's CDN because they don't appear to have a landing page anywhere for this document. Anthropic's system cards are always worth a look, and …| Simon Willison’s Weblog
Last month ChatGPT got a major upgrade. As far as I can tell the closest to an official announcement was this tweet from @OpenAI: Starting today [April 10th 2025], memory …| Simon Willison’s Weblog
GPT-4o's recent update caused it to be way too sycophantic and disingenuously praise anything the user said. OpenAI's Aidan McLaughlin: last night we rolled out our first fix to remedy …| Simon Willison’s Weblog
Relatively thin post from OpenAI talking about their recent rollback of the GPT-4o model that made the model way too sycophantic - "overly flattering or agreeable", to use OpenAIs own …| Simon Willison’s Weblog
The Chatbot Arena has become the go-to place for vibes-based evaluation of LLMs over the past two years. The project, originating at UC Berkeley, is home to a large community …| Simon Willison’s Weblog
Watching OpenAI’s new o3 model guess where a photo was taken is one of those moments where decades of science fiction suddenly come to life. It’s a cross between the …| Simon Willison’s Weblog
In the two and a half years that we’ve been talking about prompt injection attacks I’ve seen alarmingly little progress towards a robust solution. The new paper Defeating Prompt Injections …| Simon Willison’s Weblog
Vibe coding is having a moment. The term was coined by Andrej Karpathy just a few weeks ago (on February 6th) and has since been featured in the New York …| Simon Willison’s Weblog
Apple told John Gruber (and other Apple press) this about the new "personalized" Siri: It’s going to take us longer than we thought to deliver on these features and we …| Simon Willison’s Weblog
Online discussions about using Large Language Models to help write code inevitably produce comments from developers who’s experiences have been disappointing. They often ask what they’re doing wrong—how come some …| Simon Willison’s Weblog
I’ve added a powerful new capability to my shot-scraper command line browser automation tool: you can now use it to load a web page in a headless browser, execute JavaScript …| Simon Willison’s Weblog
A surprisingly common complaint I see from developers who have tried using LLMs for code is that they encountered a hallucination—usually the LLM inventing a method or even a full …| Simon Willison’s Weblog
There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because …| Simon Willison’s Weblog
LLM 0.23 is out today, and the signature feature is support for schemas—a new way of providing structured output from a model that matches a specification provided by the user. …| Simon Willison’s Weblog
I’ve been making a lot of progress on Datasette Cloud this week. As an application that provides private hosted Datasette instances (initially targeted at data journalists and newsrooms) the majority …| Simon Willison’s Weblog
llm-mlx is a brand new plugin for my LLM Python Library and CLI utility which builds on top of Apple’s excellent MLX array framework library and mlx-lm package. If you’re …| Simon Willison’s Weblog
One of the most common proposed solutions to prompt injection attacks (where an AI language model backed system is subverted by a user injecting malicious input—“ignore previous instructions and do …| Simon Willison’s Weblog
I’ve started tracking TILs—Today I Learneds—inspired by this five-year-and-counting collection by Josh Branchaud on GitHub (found via Hacker News). I’m keeping mine in GitHub too, and using GitHub Actions to …| Simon Willison’s Weblog
This legendary page from an internal IBM training in 1979 could not be more appropriate for our new age of AI. A computer can never be held accountable Therefore a …| Simon Willison’s Weblog
I’m the guest for the most recent episode of the Real Python podcast with Christopher Bailey, talking about Using LLMs for Python Development. We covered a lot of other topics …| Simon Willison’s Weblog
I saw this tweet yesterday from @deepfates, and I am very on board with this: Watching in real time as “slop” becomes a term of art. the way that “spam” …| Simon Willison’s Weblog
The Oxide and Friends podcast has an annual tradition of asking guests to share their predictions for the next 1, 3 and 6 years. Here’s 2022, 2023 and 2024. This …| Simon Willison’s Weblog
I started running a basic link blog on this domain back in November 2003—publishing links (which I called “blogmarks”) with a title, URL, short snippet of commentary and a “via” …| simonwillison.net
One of my weirder hobbies is trying to convince people that the idea that companies are listening to you through your phone’s microphone and serving you targeted ads is a …| Simon Willison’s Weblog
A lot has happened in the world of Large Language Models over the course of 2024. Here’s a review of things we figured out about the field in the past …| Simon Willison’s Weblog
I’ve written a lot about how I’ve been using Claude to build one-shot HTML+JavaScript applications via Claude Artifacts. I recently started using a similar pattern to create one-shot Python utilities, …| Simon Willison’s Weblog
Amazon released three new Large Language Models yesterday at their AWS re:Invent conference. The new model family is called Amazon Nova and comes in three sizes: Micro, Lite and Pro. …| Simon Willison’s Weblog
Series: Prompt injection| Simon Willison’s Weblog
There’s a whole lot of buzz around the new Qwen2.5-Coder Series of open source (Apache 2.0 licensed) LLM releases from Alibaba’s Qwen research team. On first impression it looks like …| Simon Willison’s Weblog
Audio Overview is a fun new feature of Google’s NotebookLM which is getting a lot of attention right now. It generates a one-off custom podcast against content you provide, where …| simonwillison.net
Whether or not you enjoy MrBeast’s format of YouTube videos (here’s [a 2022 Rolling Stone profile](https://www.rollingstone.com/culture/culture-features/mrbeast-youtube-cover-story-interview-1334604/) if you’re unfamiliar), this leaked onboarding document for new members of his production company …| simonwillison.net
OpenAI released two major new preview models today: o1-preview and o1-mini (that mini one is not a preview)—previously rumored as having the codename “strawberry”. There’s a lot to understand about …| simonwillison.net
Yesterday I finally developed something I’ve been casually thinking about building for a long time: django-http-debug. It’s a reusable Django app—something you can pip install into any Django project—which provides …| simonwillison.net
I keep seeing people use the term “prompt injection” when they’re actually talking about “jailbreaking”. This mistake is so common now that I’m not sure it’s possible to correct course: …| Simon Willison’s Weblog
I attended the Story Discovery At Scale data journalism conference at Stanford this week. One of the perennial hot topics at any journalism conference concerns data extraction: how can we …| Simon Willison’s Weblog
Here is a short, illustrative example of one of the ways in which I use Claude and ChatGPT on a daily basis. I recently learned that the Adirondack Park is …| Simon Willison’s Weblog
Last week Google introduced Gemini Pro 1.5, an enormous upgrade to their Gemini series of AI models. Gemini Pro 1.5 has a 1,000,000 token context size. This is huge—previously that …| Simon Willison’s Weblog
Git scraping is the name I’ve given a scraping technique that I’ve been experimenting with for a few years now. It’s really effective, and more people should use it. Update …| Simon Willison’s Weblog
I really want an AI assistant: a Large Language Model powered chatbot that can answer questions and perform actions for me based on access to my private data and tools. …| Simon Willison’s Weblog
Today I’m releasing datasette-enrichments, a new feature for Datasette which provides a framework for applying “enrichments” that can augment your data. An enrichment is code that can be run against …| Simon Willison’s Weblog
Some extended thoughts about prompt injection attacks against software built on top of AI language models such a GPT-3. This post started as a Twitter thread but I’m promoting it …| simonwillison.net
Update 9th January 2024: This post was clumsily written and failed to make the point I wanted it to make. I’ve published a follow-up, What I should have said about …| simonwillison.net
2023 was the breakthrough year for Large Language Models (LLMs). I think it’s OK to call these AI—they’re the latest and (currently) most interesting development in the academic field of …| Simon Willison’s Weblog
Dropbox added some new AI features. In the past couple of days these have attracted a firestorm of criticism. Benj Edwards rounds it up in Dropbox spooks users with new …| Simon Willison’s Weblog