The previous posts on optimization scaling laws [1, 2] focused on problems that do not become significantly harder as the problem size increases. We showed that for some problems, as the dimension \(d\) goes to infinity, the optimality gap decays at a sublinear rate \(\Theta(k^{-p})\) for some power \(p\) that depends on the problem but is independent of \(d\). Not all problems have this nice limiting behavior, however; some become harder as the problem size increases.| Machine Learning Research Blog
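To make the rate concrete, here is a toy sketch (not taken from the posts themselves): gradient descent on an assumed quartic objective \(f(x) = \tfrac{1}{4}\sum_i x_i^4\) with a fixed step size, where the optimality gap decays roughly as \(\Theta(k^{-2})\). Fitting the power empirically shows that the exponent stays the same as \(d\) grows; only the constant in front changes.

```python
# Illustrative sketch only: the quartic objective and fixed step size are
# assumptions, not the setting analyzed in the posts.
import numpy as np

def optimality_gap_curve(d, steps=10_000, step_size=0.1, seed=0):
    """Run gradient descent on f(x) = 0.25 * sum(x_i^4) and record f(x_k) - f*,
    where f* = 0 is attained at x = 0."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.5, 1.0, size=d)
    gaps = np.empty(steps)
    for k in range(steps):
        gaps[k] = 0.25 * np.sum(x ** 4)
        x -= step_size * x ** 3          # componentwise gradient of f is x^3
    return gaps

for d in (10, 100, 1000):
    gaps = optimality_gap_curve(d)
    # Estimate the power p in gap ~ C * k^{-p} from the tail of the curve
    # with a log-log least-squares fit.
    ks = np.arange(1, len(gaps) + 1)
    tail = slice(len(gaps) // 2, None)
    p_hat = -np.polyfit(np.log(ks[tail]), np.log(gaps[tail]), 1)[0]
    print(f"d = {d:5d}   estimated p ~ {p_hat:.2f}")   # ~2 for every d
```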
There is a new book published by Oxford University Press that is a travesty of Christian apologetics undeservedly snaking the respectability of, well, Oxford University: T.C. Schmidt’s Josephus and Jesus: New Evidence for the One Called Christ. The rhetorical objective of this book is to convince people that the Testimonium Flavianum, a fawning paragraph about […]| Richard Carrier Blogs
Information on the Tranco list with ID 664NX| tranco-list.eu
A dive into open chat| wiki.alopex.li
Personal website for some random tidbits I work on| maknee.github.io
The improvements that tripled the query performance of lexical search in Vespa.| Vespa Blog
Human vocabulary comes in free text. In order to make a machine learning model understand and process natural language, we need to transform the free-text words into numeric values. One of the simplest transformation approaches is one-hot encoding, in which each distinct word stands for one dimension of the resulting vector and a binary value indicates whether the word is present (1) or not (0). However, one-hot encoding is computationally impractical when dealing with the entire vocabulary...| lilianweng.github.io
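A quick illustration of the encoding described above, using a made-up toy corpus rather than the post's own example; each word maps to a vector with a single 1 in its own dimension.

```python
# Minimal one-hot encoding sketch over a tiny, hypothetical vocabulary.
import numpy as np

corpus = ["the cat sat on the mat", "the dog sat on the log"]
vocab = sorted({word for sentence in corpus for word in sentence.split()})
index = {word: i for i, word in enumerate(vocab)}

def one_hot(word):
    """Return a vector with a 1 in the word's dimension and 0 elsewhere."""
    vec = np.zeros(len(vocab), dtype=np.int8)
    vec[index[word]] = 1
    return vec

print(vocab)           # ['cat', 'dog', 'log', 'mat', 'on', 'sat', 'the']
print(one_hot("cat"))  # [1 0 0 0 0 0 0]
# With a realistic vocabulary of hundreds of thousands of words, every vector
# becomes that long and almost entirely zeros -- the impracticality noted above.
```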