Measuring AI Ability to Complete Long Tasks - METR
Analysis code available on GitHub
metr.org