Notes from a talk I delivered at the 2025 Data + AI Summit, detailing the problem with prompts in your code and how DSPy can make everything better.| Drew Breunig
I presented a three-hour workshop at PyCon US yesterday titled Building software on top of Large Language Models. The goal of the workshop was to give participants everything they …| Simon Willison’s Weblog
A well-built custom eval lets you quickly test the newest models, iterate faster when developing prompts and pipelines, and ensure you’re always moving forward against your product’s specific goal. Let’s build an example eval – made from Jeopardy questions – to illustrate the value of a custom eval.| Drew Breunig
What I’ve seen work and what doesn’t.| Answer.AI
The quality and development speed of AI applications are often limited by the availability of high-quality evaluation datasets and metrics, which enable you to both optimize and test your applications.| docs.smith.langchain.com
How to use your data warehouse's built-in features to simplify and potentially improve your RAG pipeline.| Rainforest QA Blog | Software Testing Guides
I summarise the kinds of evaluations that are needed for a structured data generation task.| mlops.systems
To hear directly from the authors on this topic, sign up for the upcoming virtual event on June 20th, and learn more from the Generative AI Success Stories Superstream on June 12th.| O’Reilly Media