Double descent is a puzzling phenomenon in machine learning where increasing model size, training time, or data can initially hurt performance, but then i… | www.alignmentforum.org