A month or two ago I wrote this post which expressed my frustration with various issues around private datasets as a way of measuring the mathematical abilities of language models. More generally I was frustrated about the difficulty of being … Continue reading →