Introducing EmbeddingGemma Brand new open weights (under the slightly janky Gemma license) 308M parameter embedding model from Google:Based on the Gemma 3 architecture, EmbeddingGemma is trained on 100+ languages and is small enough to run on less than 200MB of RAM with quantization. It's available via sentence-transformers, llama.cpp, MLX, Ollama, LMStudio and more. As usual for these smaller models there's a Transformers.js demo (via) that runs directly in the browser (in Chrome variants) -...| Simon Willison's Weblog
Learn how the ModernBERT backbone model paves the way for more efficient and effective retrieval pipelines, and how to use ModernBERT in Vespa.| Vespa Blog
Executive Summary This blog post delves into the complexities of embedding computation for structured data, illustrating how to streamline this process using Python decorators. By addressing common pitfalls such as excessive boilerplate and tightly coupled logic, we introduce a clean, modular approach using the concept of embedders. Through practical examples, we demonstrate how decorators can simplify embedding generation, reduce human error, and improve code maintainability. This solution i...| John Azariah’s Blog
Learned embeddings often suffer from ’embedding collapse’, where they occupy only a small subspace of the available dimensions. This article explores the causes of embedding collapse, from two-tower models to GNN-based systems, and its impact on model scalability and recommendation quality. We discuss methods to detect collapse and examine recent solutions proposed by research teams at Visa, Facebook AI, and Tencent Ads to address this challenge.| Sumit's Diary
In this post, we will discuss how to build a Prompt Injection detector using a simple classification task with Scikit-learn’s Logistic Regression. Logistic Regression is a statistical method …| Shekhar Gulati
The Feature Platform for Machine Learning, from the creators of the Uber Michelangelo feature store| Tecton
Three comprehensive guides to using the Cohere Embed v3 binary embeddings with Vespa.| Vespa Blog
Embeddings – condensed, rich representations of unstructured data – have emerged as a transformative technique for unlocking the full potential of both predictive and generative AI.| Tecton
Update: Now with GPT-4 and GPT-3.5 support.| WFH Brian
How I built a post recommendation feature for my blog using text embeddings, GPT-4 and ChromaDB with LangChain| Ishan Das Sharma
Announcing long-context ColBERT, giving it larger context for scoring and simplifying long-document RAG applications.| Vespa Blog
Using the “shortening” properties of OpenAI v3 embedding models to greatly reduce latency/cost while retaining near-exact quality| Vespa Blog
Question: How does reducing the precision of vector components to various extents (half, third, quarter, fifth) using different methods (toFixed, Math.round)| WFH Brian
In recent years, there has been a significant amount of research activity in the graph representation learning domain. These learning methods help in analyzing abstract graph structures in information networks and improve the performances of state-of-the-art machine learning solutions for real-world applications, such as social recommendations, targeted advertising, user search, etc. This article provides a comprehensive introduction to the graph representation learning domain, including comm...| Sumit's Diary