The past posts on optimization scaling laws [1, 2] focused on problems that do not become significantly harder as the problem size increases. We showed that, for such problems, as the dimension \(d\) goes to infinity, the optimality gap after \(k\) iterations converges at a sublinear rate \(\Theta(k^{-p})\), for some power \(p\) that depends on the problem but is independent of \(d\). Not all problems have this nice limiting behavior, however: some do become harder as the problem size increases.
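To make the dimension-independent rate concrete, here is a minimal numerical sketch (not taken from the posts; the spectrum and initialization are illustrative assumptions): gradient descent on a diagonal quadratic \(f(x) = \tfrac{1}{2} \sum_i \lambda_i x_i^2\) with eigenvalues \(\lambda_i = 1/i\) and initialization \(x_{0,i} = 1/i\). The optimality gap after \(k\) steps has a closed form, and once \(d\) is much larger than \(k\) it behaves like \(\Theta(k^{-2})\), essentially independently of \(d\).

```python
import numpy as np

# Minimal sketch (illustrative assumptions, not from the posts):
# gradient descent on f(x) = 1/2 * sum_i lambda_i * x_i^2 with
# lambda_i = 1/i, initialization x0_i = 1/i, step size gamma = 1/lambda_max = 1.
# Since x_{k,i} = (1 - gamma*lambda_i)^k * x0_i, the optimality gap is
#   f(x_k) - f(x*) = 1/2 * sum_i lambda_i * (1 - gamma*lambda_i)^(2k) * x0_i^2.

def optimality_gap(d, k, gamma=1.0):
    i = np.arange(1, d + 1, dtype=float)
    lam = 1.0 / i      # eigenvalue spectrum of the quadratic
    x0 = 1.0 / i       # initial point; f(x0) stays bounded as d grows
    return 0.5 * np.sum(lam * (1.0 - gamma * lam) ** (2 * k) * x0 ** 2)

for d in [10**3, 10**4, 10**5]:
    for k in [10, 100, 1000]:
        gap = optimality_gap(d, k)
        print(f"d={d:>6}  k={k:>5}  gap={gap:.3e}  k^2*gap={k**2 * gap:.3f}")
```

The last column should stabilize (roughly around \(1/8\) for large \(k\)) as soon as \(d \gg k\), illustrating a rate that depends on \(k\) but not on \(d\).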
Revisiting scaling laws via the z-transform
In the last few years, we have seen a surge of empirical and theoretical work on “scaling laws”, whose goal is to characterize the performance of learning methods in terms of various problem parameters (e.g., the number of observations and parameters, or the amount of compute). From a theoretical point of view, this marks a renewed interest in asymptotic equivalents, something the machine learning community had mostly moved away from (and, let's be honest, somewhat looked down on) in favor of ...