On GPT-3: meta-learning, scaling, implications, and deep theory. The scaling hypothesis: neural nets absorb data & compute, generalizing and becoming more Bayesian as problems get harder, manifesting new abilities even at trivial-by-global-standards scale. The deep learning revolution has begun as foretold.| gwern.net
Large language models like ChatGPT have recently spooked a great many people, and my Twitter feed is full of worriers saying how irresponsible orgs have been to make and release such models.| www.overcomingbias.com