“Can an LLM toss an unbiased coin?” came up in a conversation with a friend recently. On the one hand, it shouldn’t be hard for an LLM to understand the concept of “unbiased” and output equal logits over the tokens for “Heads” and “Tails”. On the other hand, the training distribution is almost certainly not equally split between the two. I checked GPT-4o, Gemini 1.5 Pro, and Claude 3.5 Sonnet.
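
Before getting to the results, here is roughly what such a check looks like. This is a minimal sketch assuming the OpenAI Python SDK and an `OPENAI_API_KEY` in the environment; the prompt wording, sample count, and helper names are illustrative placeholders, not necessarily the exact setup used. It looks at the model’s first-token log-probabilities (the “equal logits” question directly) and also tallies repeated sampled flips.

```python
# A minimal sketch of this kind of check, assuming the OpenAI Python SDK
# (pip install openai) and an OPENAI_API_KEY in the environment. The prompt
# wording and sample count are illustrative, not the exact values used.
import math
from collections import Counter

from openai import OpenAI

client = OpenAI()
PROMPT = "Toss an unbiased coin. Reply with exactly one word: Heads or Tails."

def first_token_distribution(model: str = "gpt-4o") -> None:
    """Print the top candidate tokens and probabilities for the first output token."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT}],
        max_tokens=1,
        logprobs=True,
        top_logprobs=5,
    )
    for cand in resp.choices[0].logprobs.content[0].top_logprobs:
        print(f"{cand.token!r}: p = {math.exp(cand.logprob):.3f}")

def flip_tally(n: int = 100, model: str = "gpt-4o") -> Counter:
    """Tally n independent sampled flips to estimate the Heads/Tails split."""
    counts = Counter()
    for _ in range(n):
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": PROMPT}],
            max_tokens=3,
            temperature=1.0,  # sample from the distribution rather than taking argmax
        )
        counts[resp.choices[0].message.content.strip()] += 1
    return counts

if __name__ == "__main__":
    first_token_distribution()
    print(flip_tally(50))
```

A genuinely unbiased model would put roughly equal probability on the “Heads” and “Tails” tokens and produce a roughly 50/50 tally; any consistent skew in either number is the bias in question.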