Turning models into products runs into five challenges | AI Snake Oil
Making sense of recent technology trends and claims | www.aisnakeoil.com
🔍 o1-preview-level performance on AIME & MATH benchmarks. | api-docs.deepseek.com
We investigate four constraints to scaling AI training: power, chip manufacturing, data, and latency. We predict 2e29 FLOP runs will be feasible by 2030. | Epoch AI
Scaling will run out. The question is when. | www.aisnakeoil.com
Trying to make an AI model that can’t be misused is like trying to make a computer that can’t be used for bad things | www.aisnakeoil.com
Today we release Llemma: 7 billion and 34 billion parameter language models for mathematics. The Llemma models were initialized with Code Llama weights, then trained on the Proof-Pile II, a 55 billion token dataset of mathematical and scientific documents. The resulting models show improved mathematical capabilities, and can be adapted to various tasks through prompting or additional fine-tuning. | EleutherAI Blog