Research agents and increasingly general reasoning models open the door to immense "evaluation data leverage". | dataleverage.substack.com