Generative language models trained to predict what comes next have proven to be a very useful foundation for models that can perform a wide variety of traditionally difficult language tasks. Perplexity is the standard measure of how well such a model can predict the next word on a given text, and it's very closely related to cross-entropy and bits-per-byte. It's a measure of how effective the language model is on the text, and in certain settings aligns with how well the model perform...
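
As a rough sketch of how these quantities relate (assuming the cross-entropy is an average per token in nats; the exact averaging and tokenisation depend on the setup, and the helper names here are just for illustration):

```python
import math

def perplexity_from_cross_entropy(cross_entropy_nats: float) -> float:
    """Perplexity is the exponential of the average per-token cross-entropy (in nats)."""
    return math.exp(cross_entropy_nats)

def bits_per_byte(cross_entropy_nats: float, tokens: int, text_bytes: int) -> float:
    """Convert average per-token cross-entropy (nats) into bits per byte of the original text."""
    total_bits = cross_entropy_nats * tokens / math.log(2)
    return total_bits / text_bytes

# Example: an average cross-entropy of 3.0 nats/token over 1000 tokens of 4000 bytes of text
print(perplexity_from_cross_entropy(3.0))                  # ~20.1
print(bits_per_byte(3.0, tokens=1000, text_bytes=4000))    # ~1.08
```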