Clémentine Fourrier of HuggingFace on why you should stop using LLMs as judges, what comes after MMLU, how prompt formatting sways benchmark results, and why leaderboards are GPU poor | www.latent.space
How we can use AI as a "partner in thought", losing faith in long context windows for improved reasoning, and why we should stop anthropomorphizing LLMs | www.latent.space