We release OLMoASR, a family of open automatic speech recognition (ASR) models trained from scratch on a curated, large-scale dataset.| allenai.org
We announce Asta, our bold initiative to accelerate science through trustworthy, truly open agentic AI.| Ai2 Blog
Introducing AstaBench, a novel evaluation framework for AI agents and a benchmark suite for scientific research.| allenai.org
We find that two simple metrics, signal and noise, reveal key differences in the utility of current LLM benchmarks.| Ai2 Blog
Introducing MoNaCo, a benchmark of highly challenging questions spanning dozens of documents for evaluating large language models.| allenai.org
Tülu 3 is a leading instruction-following model family, offering fully open-source data, code, and recipes.| allenai.org