Developing a model plugin| llm.datasette.io
This post is a little history lesson in the world of natural language processing. It gives a practical overview of the techniques (Markov chains, BPE, GPTs) used to generate text, from 1948 (Claude Shannon’s seminal paper) to today’s LLMs.| obrhubr.org
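(An illustrative aside, not code from the post: Shannon’s 1948 paper already demonstrated generating text by sampling from Markov chains over words. A minimal bigram version, as a sketch, might look like this.)

```python
import random
from collections import defaultdict

def build_bigram_model(text):
    """Map each word to the list of words that follow it in the corpus."""
    words = text.split()
    model = defaultdict(list)
    for prev, nxt in zip(words, words[1:]):
        model[prev].append(nxt)
    return model

def generate(model, start, length=20):
    """Walk the chain: repeatedly sample a successor of the current word."""
    out = [start]
    for _ in range(length):
        successors = model.get(out[-1])
        if not successors:
            break
        out.append(random.choice(successors))
    return " ".join(out)

# Toy corpus; any larger text makes the chain more interesting.
model = build_bigram_model("the cat sat on the mat and the dog sat on the rug")
print(generate(model, "the"))
```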
What’s the deal with the uncanny valley?| minimaxir.com
Professor Naftali Tishby passed away in 2021. I hope this post can introduce his cool idea of the information bottleneck to more people. Recently I watched his talk “Information Theory in Deep Learning” and found it very interesting. He presented how to apply information theory to study the growth and transformation of deep neural networks during training. Using the Information Bottleneck (IB) method, he proposed a new learning bound for deep neural networks (DNNs), as ...| lilianweng.github.io
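For readers new to the idea: in its standard form (the post’s notation may differ), the Information Bottleneck looks for a representation \(T\) of the input \(X\) that is maximally compressed while staying predictive of the label \(Y\):

\[
\min_{p(t \mid x)} \; I(X; T) \;-\; \beta \, I(T; Y),
\]

where the multiplier \(\beta > 0\) trades off compression (small \(I(X;T)\)) against preserved label information (large \(I(T;Y)\)).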
I’m excited to share a 3-state, 3-symbol Turing machine that cannot be proven to halt or run forever (when started on a blank tape) without solving a Collatz-like problem. Therefore, solving the \(BB(3, 3)\) problem is at least as hard as solving this Collatz-like problem, a class of problems about which Paul Erdős famously said: “Mathematics may not be ready for such problems.”| sligocki
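For reference, here is the classic Collatz map, of which the machine’s behavior is a harder relative (this sketch shows only the standard example, not the iteration from the post):

```python
def collatz_steps(n):
    """Iterate the classic Collatz map until reaching 1, counting steps.
    The (open) conjecture is that this loop terminates for every n >= 1."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

print(collatz_steps(27))  # 111 steps
```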
I'm sure you agree that it has become impossible to ignore Generative AI (GenAI), as we are constantly bombarded with mainstream news about Large Language Models (LLMs). Very likely you have tried…| blog.miguelgrinberg.com
Recently I reread some chapters of the book The Practice of Programming by Brian Kernighan and Rob Pike.| thewagner.net
It seemed a bit unfair to devote a blog to machine learning (ML) without talking about its current core algorithm: stochastic gradient descent (SGD). Indeed, SGD has become, year after year, the foundation of many algorithms used for large-scale ML problems. However, the history of stochastic approximation is much older than that of ML: its first study, by Robbins and Monro [1], dates back to 1951. Their aim was to find the zeros of a function that can only be accessed through noisy measurements...| Machine Learning Research Blog
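As background, the standard Robbins–Monro statement (not quoted from the post): to find a root of \(f\) from noisy evaluations, iterate

\[
\theta_{n+1} = \theta_n - a_n Y_n, \qquad \mathbb{E}[Y_n \mid \theta_n] = f(\theta_n),
\]

with step sizes satisfying \(\sum_n a_n = \infty\) and \(\sum_n a_n^2 < \infty\) (e.g. \(a_n = c/n\)). SGD is the special case where \(f\) is the gradient of an expected loss.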
This report evaluates the likelihood of ‘explosive growth’, meaning > 30% annual growth of gross world product (GWP), occurring by 2100. Although frontier GDP/capita growth has been constant for 150 years, over the last 10,000 years GWP growth has accelerated significantly. Endogenous growth theory, together with the empirical fact of the demographic transition, can explain […]| Open Philanthropy
Today I’d like to start explaining an approach to stochastic time evolution for ‘state charts’, a formalism commonly used in agent-based models. This is ultimately supposed to interact we…| Azimuth
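To make “stochastic time evolution” concrete, here is a minimal continuous-time Markov chain simulator in the style of Gillespie’s direct method; this is a sketch under that assumption (the post may develop a different formalism), and the ‘idle’/‘busy’ chart is hypothetical.

```python
import random

def simulate_ctmc(rates, state, t_end):
    """Simulate a continuous-time Markov chain by Gillespie's direct method.

    rates: dict mapping each state to a dict {next_state: transition_rate}.
    Returns the (time, state) pairs visited up to time t_end.
    """
    t, path = 0.0, [(0.0, state)]
    while t < t_end:
        out = rates.get(state, {})
        total = sum(out.values())
        if total == 0:  # absorbing state: no outgoing transitions
            break
        # Waiting time is exponential with rate = sum of outgoing rates.
        t += random.expovariate(total)
        if t >= t_end:
            break
        # Choose the transition with probability proportional to its rate.
        r = random.uniform(0, total)
        for nxt, rate in out.items():
            r -= rate
            if r <= 0:
                state = nxt
                break
        path.append((t, state))
    return path

# Hypothetical two-state chart: an agent flips between 'idle' and 'busy'.
rates = {"idle": {"busy": 1.0}, "busy": {"idle": 0.5}}
print(simulate_ctmc(rates, "idle", t_end=10.0))
```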
Randomness in Haskell| jtobin.io