New blog posts will show up at| Data Science Castnet
Last week a guy called Evan Miller tweeted out a blog post claiming to have discovered a flaw in the attention mechanism used by transformers today. The phrasing was sensationalist, and many people…
While I’ve previously consulted on NLP projects, in the past few years my research focus has been chiefly on images. If you had asked me a few months ago about looking at LLMs, my default res…
In this series, I’d like to explore how to take an idea within machine learning from proof of concept to production. This first post is going to get things going with a little mini-project th…
I was briefly nerd-sniped this morning by the following tweet: Can we quantify how ‘predictable’ a set of lyrics are? Language Models and Token Probabilities: A language model is a neura…
A few recent projects I’ve worked on have been documented elsewhere but haven’t made it to this blog. The point of this post is to summarize these so that they aren’t lost in the …
As part of the Huggingface ‘#huggan’ event, I thought it would be interesting to fine-tune a latent diffusion model on the WikiArt dataset, which (as the name suggests) consists of pain…
The model demo running on Huggingface Spaces. I wanted a fast way to go from an image to something like a rough charcoal sketch. This would be the first step in a longer pipeline that would later ad…
NB: A scoring glitch caused this approach to look very good on the leaderboard, but local validation and a fix from Zindi later confirmed that it isn’t as magical as it first seemed. Still in…
Generative models are all the rage at the moment, and quality seems to be skyrocketing across the board. In this post, I share what I’m realizing is *the* key recipe that is powering the best…