Standard retrieval can only get you so far. Alignment, contextual retrieval, and reranking can improve your RAG pipeline considerably.| TechTalks - Technology solving problems... and creating new ones
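The reranking step mentioned above is a second pass that re-scores first-stage retrieval hits with a stronger (but slower) relevance model. A minimal sketch, using a toy lexical-overlap scorer as a hypothetical stand-in for a real cross-encoder:

```python
def lexical_score(query: str, doc: str) -> float:
    """Toy relevance score: fraction of query terms appearing in the doc.
    A production reranker would use a cross-encoder model here instead."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0

def rerank(query: str, retrieved: list[str], top_k: int = 3) -> list[str]:
    """Re-order first-stage retrieval results by the second-pass scorer."""
    return sorted(retrieved, key=lambda d: lexical_score(query, d),
                  reverse=True)[:top_k]

# Hypothetical first-stage results from a vector store:
docs = [
    "Gemma models run on consumer GPUs.",
    "Reranking improves RAG answer quality.",
    "Retrieval augmented generation uses a vector store.",
]
print(rerank("how does reranking improve RAG", docs, top_k=2))
```

The two-stage design keeps retrieval cheap over the whole corpus and spends the expensive scoring only on the shortlist.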
Large language models (LLMs) demand large amounts of memory and compute. LLM compression techniques make models compact enough to run on memory-constrained devices.| TechTalks
Explore Gemma 3 270M, a compact, energy-efficient AI model for task-specific fine-tuning, offering strong instruction-following and production-ready quantization.| developers.googleblog.com
Explore Gemma 3 models now offering state-of-the-art AI performance on consumer GPUs with new int4 quantized versions optimized with Quantization Aware Training (QAT).| developers.googleblog.com
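The int4 quantization mentioned above maps float weights onto 16 integer levels plus a per-tensor scale. A toy symmetric quantizer sketches the idea (illustrative only; real QAT simulates this rounding during training so the model learns to tolerate it):

```python
def quantize_int4(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric int4 quantization: map floats to integers in [-8, 7]
    using a single per-tensor scale. Toy example, not the Gemma recipe."""
    scale = max(abs(w) for w in weights) / 7.0 or 1.0  # guard all-zero case
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from int4 codes."""
    return [v * scale for v in q]

w = [0.12, -0.07, 0.31, -0.25]
q, s = quantize_int4(w)
w_hat = dequantize(q, s)  # approximate reconstruction of w
```

Storing 4-bit codes plus one scale cuts weight memory roughly 8x versus float32, at the cost of the rounding error visible in `w_hat`.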