I want a theory that predicts which features deep nets learn, when they learn them, and why. But neural nets are messy and hard to analyse, so we need a way of simplifying them that still recovers the properties we care about. Deep linear networks (DLNs) are one attempt at that: models that keep depth, nonconvexity, and hierarchical representation formation while remaining analytically tractable. In principle, they let me connect data geometry (singular ...
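To make this concrete, here is a minimal sketch of a deep linear network: the end-to-end map is just a product of weight matrices, so the function class is linear, yet the loss is nonconvex in the weights and the training dynamics depend on depth. All the specifics (dimensions, learning rate, init scale) are illustrative choices, not taken from any particular paper.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))      # inputs
W_true = rng.standard_normal((5, 3))   # target linear map
Y = X @ W_true                         # labels

depth = 3
dims = [5, 5, 5, 3]                    # layer widths (illustrative)
lr = 0.02
Ws = [0.5 * rng.standard_normal((dims[i], dims[i + 1]))
      for i in range(depth)]

for step in range(3000):
    # forward: the network computes X @ W1 @ W2 @ W3 (no nonlinearity)
    acts = [X]
    for W in Ws:
        acts.append(acts[-1] @ W)
    # backward: gradient of mean squared error through each factor
    grad = 2 * (acts[-1] - Y) / len(X)
    for i in reversed(range(depth)):
        gW = acts[i].T @ grad          # gradient for this layer
        grad = grad @ Ws[i].T          # propagate to the layer below
        Ws[i] -= lr * gW

# recompute the end-to-end map and report the final loss
out = X
for W in Ws:
    out = out @ W
final_loss = float(np.mean((out - Y) ** 2))
print(final_loss)
```

Even in this toy setup, the product parameterisation is what makes the problem interesting: the optimum in function space is a single matrix, but gradient descent on the factors follows depth-dependent trajectories.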