When it comes to language models, we tend to look at benchmarks to decide which model is best for our application. But benchmarks only tell half the story. Unless you're building an all-purpose chat application, what you should actually be looking at is how well a model works for your application. This article is based on a benchmark I put together to evaluate which of the language models available on Ollama is best suited for Dev Proxy. You can find the working prototype on GitHub. I...