Appearance | dynalist.io
UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction| umap-learn.readthedocs.io
This post relates an observation I've made in my work with GPT-2, which I have not seen made elsewhere. …| www.lesswrong.com
Discussions: Hacker News (65 points, 4 comments), Reddit r/MachineLearning (29 points, 3 comments). Translations: Arabic, Chinese (Simplified) 1, Chinese (Simplified) 2, French 1, French 2, Italian, Japanese, Korean, Persian, Russian, Spanish 1, Spanish 2, Vietnamese. Watch: MIT’s Deep Learning State of the Art lecture referencing this post. Featured in courses at Stanford, Harvard, MIT, Princeton, CMU and others. Update: This post has now become a book! Check out LLM-book.com which contains (C...| jalammar.github.io
We report the existence of multimodal neurons in artificial neural networks, similar to those found in the human brain.| Distill
Mechanistic interpretability seeks to reverse engineer neural networks, similar to how one might reverse engineer a compiled binary computer program. After all, neural network parameters are in some sense a binary computer program which runs on one of the exotic virtual machines we call a neural network architecture.| transformer-circuits.pub
Let's use sinusoidal functions to inject the order of words in our model| kazemnejad.com
Rotary Positional Embedding (RoPE) is a new type of position encoding that unifies absolute and relative approaches. We put it to the test.| EleutherAI Blog