Topic: Achieving 10,000x training data reduction with high-fidelity labels