A quick refresher on the maths behind LLMs: vectors, matrices, projections, embeddings, logits and softmax.
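As a taster of the last step in that pipeline, here's a minimal NumPy sketch of softmax turning logits into probabilities -- the four logits are invented for illustration:

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    # Subtracting the max is the standard numerical-stability trick:
    # it changes nothing mathematically but avoids overflow in exp.
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / exps.sum()

# Hypothetical logits over a tiny four-token vocabulary.
logits = np.array([2.0, 1.0, 0.1, -1.0])
probs = softmax(logits)
print(probs, probs.sum())  # probabilities, summing to 1.0
```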
How AI chatbots like ChatGPT work under the hood -- the post I wish I'd found before starting 'Build a Large Language Model (from Scratch)'.
A pause to take stock: realising that attention heads are simpler than I'd thought explained why we do the calculations we do.
Finally getting to the end of chapter 3 of Raschka's LLM book! This time it's multi-head attention: what it is, how it works, and why the code does what it does.
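As a flavour of the trick at the heart of that code, here's a rough PyTorch sketch of how one big query projection gets split into per-head matrices with view and transpose, rather than running a separate small projection per head -- all the dimensions here are invented for illustration:

```python
import torch

batch, seq_len, d_model, num_heads = 2, 6, 32, 4
head_dim = d_model // num_heads  # 8

x = torch.randn(batch, seq_len, d_model)
W_query = torch.nn.Linear(d_model, d_model, bias=False)

queries = W_query(x)                                     # (2, 6, 32)
# Carve the last axis into heads, then swap axes so each head
# gets its own (seq_len, head_dim) matrix to attend with.
queries = queries.view(batch, seq_len, num_heads, head_dim)
queries = queries.transpose(1, 2)                        # (2, 4, 6, 8)
print(queries.shape)
```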
Batching speeds up training and inference, but for LLMs we can't just use matrices for it -- we need higher-order tensors.
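To make that concrete, a small PyTorch sketch with invented shapes: each sequence is already a matrix (tokens by embedding dimensions), so a batch of them needs a rank-3 tensor, and matmul broadcasts over the extra batch axis for us:

```python
import torch

batch, seq_len, d_model = 8, 16, 32
# One matrix per sequence, stacked into a rank-3 tensor.
inputs = torch.randn(batch, seq_len, d_model)

# A single weight matrix still works: @ broadcasts over the
# leading batch dimension, transforming every sequence at once.
W = torch.randn(d_model, d_model)
outputs = inputs @ W

print(outputs.shape)  # torch.Size([8, 16, 32])
```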
Why dropout is kind of like the mandatory vacation policies financial firms have.
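To see the analogy in code, a tiny PyTorch sketch (values invented for illustration): with dropout active, any activation can vanish on any training step, so the network can't let one neuron become indispensable -- much as mandatory vacations stop any one employee becoming a single point of failure:

```python
import torch

torch.manual_seed(0)
dropout = torch.nn.Dropout(p=0.5)  # module defaults to training mode

activations = torch.ones(8)
# Roughly half the activations are zeroed; the survivors are
# scaled up by 1/(1-p) = 2 so the expected value is unchanged.
print(dropout(activations))
```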