Generative language models trained to predict what comes next have proven to be a useful foundation for models that can perform a wide variety of traditionally difficult language tasks. Perplexity is the standard measure of how well such a model predicts the next word of a given text, and it’s closely related to cross-entropy and bits-per-byte. It measures how effective the language model is on that text, and in certain settings aligns with how well the model perform...
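
To make the relationship concrete, here is a minimal sketch of how perplexity, cross-entropy, and bits-per-byte can all be derived from the same per-token log-probabilities. The inputs (a list of natural-log token probabilities and the evaluated text) are hypothetical placeholders, not a specific library's API:

```python
import math

def perplexity_and_bpb(token_logprobs, text):
    """Sketch: compute perplexity, cross-entropy, and bits-per-byte
    from hypothetical per-token natural-log probabilities."""
    n_tokens = len(token_logprobs)
    # Cross-entropy in nats per token: average negative log-likelihood.
    cross_entropy = -sum(token_logprobs) / n_tokens
    # Perplexity is the exponential of the cross-entropy.
    ppl = math.exp(cross_entropy)
    # Total negative log-likelihood converted to bits, then normalized
    # by the UTF-8 byte length of the text to give bits-per-byte.
    total_bits = -sum(token_logprobs) / math.log(2)
    bpb = total_bits / len(text.encode("utf-8"))
    return ppl, cross_entropy, bpb
```

A lower value on any of these means the model assigns higher probability to the text; they differ only in the base of the logarithm and in whether the normalization is per token or per byte.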