I was on the front page of Hacker News for my two last blog posts and I learned various things forom the discussion and scrutiny of my approach to evaluating my finetuned LLMs.| mlops.systems
I summarise the kinds of evaluations that are needed for a structured data generation task.| mlops.systems
I tried out some services that promise to simplify the process of finetuning open models. I describe my experiences with Predibase, OpenPipe and OpenAI.| mlops.systems
I finetuned my first LLM(s) for the task of extracting structured data from ISAF press releases. Initial tests suggest that it worked pretty well out of the box.| mlops.systems
I evaluated the baseline performance of OpenAI's GPT-4-Turbo on the ISAF Press Release dataset.| mlops.systems
I used Instructor to understand how well LLMs are at extracting data from the ISAF Press Releases dataset. They did pretty well, but not across the board.| mlops.systems
I'm publishing a unique new dataset of Afghan newspaper and magazine articles from the 2006-2009 period.| mlops.systems