A couple of lessons learned in moving from Fabric3 to Fabric| Giles' Blog
Posts in the 'TIL deep dives' category on Giles Thomas’s blog. Insights on AI, startups, software development, and technical projects, drawn from 30 years of experience.| www.gilesthomas.com
Having worked through chapter 3 of Sebastian Raschka's book "Build a Large Language Model (from Scratch)", and spent some time digesting the concepts it introduced (most recently in my post on the complexity of self-attention at scale), it's time for chapter 4. I've read it through in its entirety, and rather than working through it section-by-section in order, like I did with the last one, I think I'm going to jump around a bit, covering each new concept and how I wrapped my head around it s...| Giles' blog
On sabbatical / created @PythonAnywhere.com, which found a home at @anacondainc.bsky.social / XP / Python / PSF Fellow / opinions my own / blog at https://www.gilesthomas.com| Bluesky Social
Finally getting to the end of chapter 3 of Raschka’s LLM book! This time it’s multi-head attention: what it is, how it works, and why the code does what it does.| Giles' Blog
Posts in the 'LLM from scratch' category on Giles Thomas’s blog. Insights on AI, startups, software development, and technical projects, drawn from 30 years of experience.| www.gilesthomas.com
Posts in the 'AI' category on Giles Thomas’s blog. Insights on AI, startups, software development, and technical projects, drawn from 30 years of experience.| www.gilesthomas.com
Adding dropout to the LLM's training is pretty simple, though it does raise one interesting question| Giles' Blog
Causal, or masked self-attention: when we're considering a token, we don't pay attention to later ones. Following Sebastian Raschka's book 'Build a Large Language Model (from Scratch)'. Part 9/??| Giles' Blog
Moving on from a toy self-attention mechanism, it's time to find out how to build a real trainable one. Following Sebastian Raschka's book 'Build a Large Language Model (from Scratch)'. Part 8/??| Giles' Blog
Boost your learning: Test Yourself PDF Learn how to create, train, and tweak large language models (LLMs) by building one from the ground up! In Build a Large Language Model (from Scratch) bestselling author Sebastian Raschka guides you step by step through creating your own LLM. Each stage is explained with clear text, diagrams, and examples. You’ll go from the initial design and creation, to pretraining on a general corpus, and on to fine-tuning for specific tasks. Build a Large Language ...| Manning Publications