Podcast EP2: Shipping reliable AI actions. Pratik and I recently recorded a wide-ranging discussion on his work leading the Tasks/Procedures workstream. Pratik’s workstream has been at the cutting edge of using LLMs to take actions within a business, and all the…
I recently sat down with Fedor Parfenov to discuss his work leading the AI workstream building our Insights product. We discussed the purpose and rationale behind building the Insights product; Fedor’s application of Causal…
Conventional wisdom says speed matters in software and that fast is always better. But in testing our AI agent, we found that slowing down might actually make it feel smarter.
How We Built a World-Class Reranker for Fin. We built our own reranker that outperforms Cohere Rerank v3.5, an industry-leading commercial solution. This improved our answer quality, cut reranking costs by 80%, and gave us more flexibility to evolve our system.
Using LLMs as a Reranker for RAG: A Practical Guide. Good answers start with good context. Our AI agents use retrieval-augmented generation (RAG) to find the right context for a user’s query. RAG retrieves top passages from a knowledge base, then uses them to generate an…
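The excerpt above outlines the retrieve-then-rerank flow. As a rough sketch of the idea only (not Fin’s actual implementation; `llm_complete` is a hypothetical stand-in for any completion API), an LLM-based reranker can be as small as:

```python
# Minimal sketch of using an LLM as a reranker inside a RAG pipeline.
# `llm_complete` is a hypothetical placeholder for a real model call;
# the post does not reveal Fin's prompt, model, or parsing logic.

def llm_complete(prompt: str) -> str:
    """Stand-in for an actual LLM completion call."""
    raise NotImplementedError

def rerank(query: str, passages: list[str], top_k: int = 3) -> list[str]:
    """Ask the LLM to order retrieved passages by relevance to the query."""
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages))
    prompt = (
        "Rank the passages below by how well they answer the query.\n"
        f"Query: {query}\n\nPassages:\n{numbered}\n\n"
        "Reply with the passage indices only, best first, comma-separated."
    )
    order = [int(i.strip()) for i in llm_complete(prompt).split(",")]
    return [passages[i] for i in order[:top_k]]
```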
Finetuning Retrieval for Fin. At Intercom, we’ve built Fin, an AI-powered support bot designed to understand users’ issues and answer their questions accurately. To do this, Fin relies on state-of-the-art large language models (LLMs). However, even the most advanced LLMs…
David vs Goliath: are small LLMs any good? Are smaller fine-tuned LLMs competent for Intercom-scale tasks? Large language models (LLMs) are a powerful technology that has turned reasoning in natural language into a service. They’ve had a huge impact on customer support, powering…
Building out Intercom’s AI infra. We’ll discuss how we built tools to make our scientists productive at training modern AI models: ones that require bigger, more expensive, harder-to-get GPUs, often running on the bleeding edge of the software stack.
“Was that helpful?” Understanding User Feedback in Customer Support AI Agents. Fin’s north star metric is resolution rate; it’s how we measure how well Fin, our customer support AI agent, is performing. Each resolution is priced at US$0.99, so accurately detecting when Fin resolves a conversation…
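Concretely, the metric and its billing implication reduce to simple arithmetic (the figures below are invented for illustration, not Intercom’s numbers):

```python
# Illustrative only; these volumes are made up, not Intercom's.
conversations = 10_000   # conversations Fin handled in some period
resolutions = 4_200      # conversations judged resolved by Fin

resolution_rate = resolutions / conversations  # the north star metric
billed = resolutions * 0.99                    # each resolution priced at US$0.99

print(f"resolution rate: {resolution_rate:.1%}")  # 42.0%
print(f"billed: ${billed:,.2f}")                  # $4,158.00
```

Because every detected resolution is billed, a resolution detector that errs in either direction costs real money or real trust, which is what makes the feedback-understanding problem in this post consequential.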
To escalate, or not to escalate, that is the question. One of Fin AI Agent’s most critical tasks is deciding when to escalate customer interactions to human support. This challenge has only grown as Fin has become more conversational, and now most escalations happen through natural…
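To make the shape of the decision concrete: a toy escalation gate might combine an explicit-request check with a confidence threshold. The post does not describe Fin’s real policy; everything below is an assumed illustration.

```python
# A toy escalation gate, not Fin's actual policy. Escalate when the
# user explicitly asks for a human, or when the model's confidence
# that it can resolve the issue falls below a threshold.

HUMAN_PHRASES = ("talk to a human", "speak to an agent", "real person")

def should_escalate(user_message: str, resolve_confidence: float,
                    threshold: float = 0.6) -> bool:
    explicit = any(p in user_message.lower() for p in HUMAN_PHRASES)
    return explicit or resolve_confidence < threshold

assert should_escalate("I want to talk to a human", resolve_confidence=0.9)
assert not should_escalate("How do I reset my password?", resolve_confidence=0.8)
```

The hard part the post points at is that, as conversations become more natural, explicit phrase matching catches ever fewer escalation requests, pushing the burden onto the model’s own judgment.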
Building a Better Language Detection Model for Fin. When users ask Fin a question, they expect Fin to respond in the same language in which they asked it. Detecting the language a user is speaking is a key step in the Fin pipeline…
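As a point of contrast for the custom model the post builds, here is the naive off-the-shelf baseline using the open-source `langdetect` package (`pip install langdetect`); short, noisy support messages are exactly where this kind of generic detector struggles.

```python
# Off-the-shelf language-detection baseline; the post is about
# replacing this kind of generic model with one tuned for Fin.
from langdetect import DetectorFactory, detect

DetectorFactory.seed = 0  # langdetect is nondeterministic without a fixed seed

for message in ["How do I cancel my subscription?",
                "¿Cómo cancelo mi suscripción?",
                "Comment annuler mon abonnement ?"]:
    print(detect(message), "<-", message)  # e.g. en, es, fr
```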
TL;DR: we explored AWS hardware options (and serving engines) and found that self-hosting an LLM can be significantly more cost-effective than commercial APIs. Fin is an advanced customer support AI agent powered by…
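The underlying trade-off is a back-of-the-envelope comparison between per-token API pricing and always-on GPU instances. Every number below is an assumption for illustration, not a figure from the post:

```python
# Illustrative cost comparison; all numbers are assumptions.
tokens_per_month = 10_000_000_000  # total input+output tokens served

api_price_per_1k = 0.002           # assumed $/1K tokens for a commercial API
api_cost = tokens_per_month / 1_000 * api_price_per_1k

gpu_hourly = 2.50                  # assumed $/hr for one inference instance
instances = 4
self_host_cost = gpu_hourly * instances * 24 * 30  # always-on for a month

print(f"API:       ${api_cost:,.0f}/month")        # $20,000/month
print(f"Self-host: ${self_host_cost:,.0f}/month")  # $7,200/month
```

Which side wins depends on volume, hardware choice, and serving engine throughput, which is exactly the space the post explores.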
When we think of making progress on the intelligence front in AI, we typically think of producing models with higher “peak intelligence”. But in the customer support space, and many similar applications, higher peak intelligence is no longer what we most need in order to build generally capable AI systems. Instead, “intelligence density” is how we can efficiently drive immediate and meaningful progress in real-world, latency-constrained AI applications today.
Building reliable large language model (LLM) inference is still an emerging discipline. Although the field has matured considerably in recent years, we are far from the level of dependability seen in industry-standard services such as Amazon…
We step through the optimisation process required to make an open-source reasoning model fast enough to use as a component of an interactive user application.
Using customer service conversations as a source of truth is a tempting idea, but the sheer volume of conversations can overwhelm AI Specialists. By experimenting with various architectures, we have found a solution that…
We experiment with a strategy of developing composable AI agents with slightly tempered autonomy. The resulting agent exhibits vastly improved reliability and performance.