IBM Research and partners have released Bamba-9B-v2, an open-source hybrid Transformer-SSM model trained on 3T tokens, claiming faster inference than comparable LLMs.| WinBuzzer
DFloat11 offers a lossless ~30% size reduction for BF16 LLMs, enabling much longer context lengths on GPUs.| WinBuzzer
Sakana AI has unveiled a memory management solution for Transformers that saves resources, handles long contexts, and transfers seamlessly across tasks.| WinBuzzer
Meta has launched compact Llama models to enhance mobile AI, offering efficient AI processing on smartphones and small devices.| WinBuzzer
Alibaba’s ZeroSearch trains large language models to beat Google Search and slash API costs by 88%, redefining how AI learns to retrieve information.| VentureBeat
Large Language Models (LLMs) – Overview and Latest News| WinBuzzer
Alibaba launches Qwen 2.5-Max, a new AI model challenging DeepSeek V3 in key performance benchmarks, with OpenAI API compatibility and Alibaba Cloud access.| WinBuzzer