Posts in the 'Musings' category on Giles Thomas’s blog. Insights on AI, startups, software development, and technical projects, drawn from 30 years of experience.| www.gilesthomas.com
Why dropout is kind of like the mandatory vacation policies financial firms have| Giles' Blog
Adding dropout to the LLM's training is pretty simple, though it does raise one interesting question| Giles' Blog
Causal, or masked self-attention: when we're considering a token, we don't pay attention to later ones. Following Sebastian Raschka's book 'Build a Large Language Model (from Scratch)'. Part 9/??| Giles' Blog
Archive of Giles Thomas’s blog posts from March 2025. Insights on AI, startups, and software development, plus occasional personal reflections.| www.gilesthomas.com
Moving on from a toy self-attention mechanism, it's time to find out how to build a real trainable one. Following Sebastian Raschka's book 'Build a Large Language Model (from Scratch)'. Part 8/??| Giles' Blog