From: Simon Willison’s Weblog
(Uncensored)
How often do LLMs snitch? Recreating Theo’s SnitchBench with LLM
https://simonwillison.net/2025/May/31/snitchbench-with-llm/
Tagged with: ai, openai, llm, llms, anthropic, claude, deepseek, llm-tool-use
A fun new benchmark just dropped! Inspired by the Claude 4 system card—which showed that Claude 4 might just rat you out to the authorities if you told it to …