Writing about AI, geo, culture, media, data, and the ways they interact.| Drew Breunig
What can we learn from December’s LLM blitz and o3’s arrival?| Drew Breunig
We’re on a journey to advance and democratize artificial intelligence through open source and open science.| huggingface.co
The BAIR Blog| The Berkeley Artificial Intelligence Research Blog
Increasingly, domain experts matter more when building great AI apps.| Drew Breunig
Earlier this year, I wrote “Your AI product needs evals.” Many of you asked, “How do I get started with LLM-as-a-judge?” This guide shares what I’ve learned after helping over 30 companies set up their evaluation systems. The Problem: AI Teams Are Drowning in Data. Ever spend weeks building an AI system, only to realize you have no idea if it’s actually working? You’re not alone. I’ve noticed teams repeat the same mistakes when using LLMs to evaluate AI outputs: Too Many Metrics: Cre...| Hamel's Blog
How to construct domain-specific LLM evaluation systems.| hamel.dev