Gratitude to https://tensorwave.com/ for giving me access to their excellent servers! Few have tried this and fewer have succeeded. I've been marginally successful after a significant amount of effort, so it deserves a blog post. Know that you are in...| Cognitive Computations
I enjoyed chapter 7 on finetuning. It jams a lot of detail into the 50 pages she takes to explain things. Some areas have more detail than you’d expect, others less, but overall it’s a solid summary/review. Core Narrative: Fine-tuning represents a significant technical and organisational investment that should be approached as a last resort, not a first solution. The chapter’s essential message can be distilled into three key points: The decision to fine-tune should follow exhaus...| Alex Strick van Linschoten
Explores Chapter 8 of Chip Huyen's 'AI Engineering,' examining the intricate landscape of dataset engineering through the lenses of curation, augmentation, and processing.| mlops.systems
The past three years have seen significant interest in applying language models to the task of visual document understanding – integrating spatial, textual, and visual signals to make sense of PDFs and scanned documents.| machine learning musings
Had the first of a series of meet-ups I’m organising in which we discuss Chip Huyen’s new book. My notes from reading the chapter follow, and then I’ll try to summarise what we discussed in the group. At a high level, I really enjoyed the final part of the chapter where she got into how she was thinking about the practice of ‘AI Engineering’ and how it differs from ML engineering. Also the use of the term ‘model adaptation’ was an interesting way of encompassing all the dif...| Alex Strick van Linschoten
I was on the front page of Hacker News for my last two blog posts and I learned various things from the discussion and scrutiny of my approach to evaluating my finetuned LLMs.| mlops.systems
I summarise the kinds of evaluations that are needed for a structured data generation task.| mlops.systems
I tried out some services that promise to simplify the process of finetuning open models. I describe my experiences with Predibase, OpenPipe and OpenAI.| mlops.systems
I finetuned my first LLM(s) for the task of extracting structured data from ISAF press releases. Initial tests suggest that it worked pretty well out of the box.| mlops.systems
In this article, we will work with a vision transformer from PyTorch’s Torchvision library, providing simple code examples that you can execute on your own machine without the need to download and install numerous code and dataset dependencies. The self-contained baseline training script comprises approximately 100 lines of code, excluding whitespace and code comments.| Lightning AI
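As a rough illustration of the kind of baseline that article builds (not its actual script), here is a minimal sketch assuming Torchvision’s pretrained ViT-B/16 with its head swapped for a hypothetical 10-class task; the dummy batch and hyperparameters are placeholders:

```python
import torch
import torch.nn as nn
from torchvision.models import vit_b_16, ViT_B_16_Weights

weights = ViT_B_16_Weights.IMAGENET1K_V1
model = vit_b_16(weights=weights)

# Swap the pretrained classification head for a hypothetical 10-class task.
model.heads.head = nn.Linear(model.heads.head.in_features, 10)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One training step on a dummy batch; a real script would iterate over a
# DataLoader built with weights.transforms() so inputs match the pretrained
# preprocessing (224x224 crops, ImageNet normalisation).
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 10, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```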
LoRA is one of the most widely used, parameter-efficient finetuning techniques for training custom LLMs. From saving memory with QLoRA to selecting the optimal LoRA settings, this article provides practical insights for those interested in applying it.| Lightning AI
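To illustrate the core idea behind LoRA discussed there, here is a minimal from-scratch sketch, not the article’s implementation: a frozen linear layer augmented with a trainable low-rank update, with illustrative r and alpha values.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update:
    y = W x + (alpha / r) * B A x. The r and alpha defaults are illustrative."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # only the low-rank factors are trained
        # A starts small and random, B starts at zero, so the update is
        # initially zero and training begins from the pretrained behaviour.
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(768, 768))
print(layer(torch.randn(4, 768)).shape)  # torch.Size([4, 768])
```

After training, the low-rank update can be merged back into the base weight matrix, so inference pays no extra cost.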
A look at extending pre-trained representations with document retrieval to better solve downstream tasks.| machine learning musings
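As a minimal sketch of the retrieval step such an approach depends on (assuming an off-the-shelf BERT encoder with mean pooling, an illustrative choice rather than the post’s specific setup): embed a query and candidate documents, rank by cosine similarity, and feed the top match to the downstream task.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.eval()

def embed(texts):
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state
    mask = batch.attention_mask.unsqueeze(-1)
    # Mean-pool token embeddings, ignoring padding positions.
    return (hidden * mask).sum(1) / mask.sum(1)

docs = ["Invoices must be paid within 30 days.",
        "The warranty covers manufacturing defects."]
query = embed(["How long do I have to pay an invoice?"])
scores = torch.nn.functional.cosine_similarity(query, embed(docs))
# The top-scoring document becomes extra context for the downstream task.
print(docs[scores.argmax()])
```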
Leveraging the knowledge locked away in language models by reframing categorical tasks as constrained text generation.| machine learning musings
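One common way to realise that reframing, sketched below with an illustrative model and label set (assumptions, not necessarily the post’s setup): score each candidate label as a constrained continuation of a prompt and pick the most likely one.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "Review: 'The food was wonderful.' Sentiment:"
labels = [" positive", " negative"]  # generation is constrained to these continuations

def label_logprob(label: str) -> float:
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    label_ids = tok(label, return_tensors="pt").input_ids
    ids = torch.cat([prompt_ids, label_ids], dim=1)
    with torch.no_grad():
        log_probs = model(ids).logits.log_softmax(dim=-1)
    # Logits at position t-1 predict the token at position t, so sum the
    # log-probabilities of each label token given everything before it.
    offset = prompt_ids.shape[1]
    return sum(
        log_probs[0, offset + i - 1, ids[0, offset + i]].item()
        for i in range(label_ids.shape[1])
    )

print(max(labels, key=label_logprob))
```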