Last year, I wrote a quick post investigating the ‘unconditioned’ distribution of LLMs in the OpenAI API, where the ‘unconditioned distribution’ is simply the distribution of LLM outputs following the empty string – or beginning of sequence token. My intuition here was that this gives some idea of what the...