In this episode, Beyang sits down with Camden to discuss how the Amp team evaluates new models: why tool calling is the key differentiator, how open models like K2 and Qwen stack up, what GPT-5 changes, and why qualitative “vibe checks” often matter more than benchmarks. They also dive into subagents, model alloys, and the future of agentic coding inside Amp.