Free-threading in Python is finally here. Here’s why it matters and how I plan to use it.| Lukas Valatka
I’m a software engineer. I have 6+ years of experience building production machine learning systems (think efficient model deployment stacks, high-throughput feature pipelines, low-latency model serving and monitoring), commonly known as MLOps. This involves a non-trivial amount of data engineering, backend development, and some data science, primarily using Python, Go, and cloud stacks.| Lukas Valatka
One of the ideas from A Philosophy of Software Design that struck me was that good method interfaces should be shallow. Fewer parameters make it more obvious how to use a method: less cognitive load encourages folks to actually use your API. Fewer parameters mean fewer levers, so a lower chance of misuse. Fewer parameters reduce coupling, making method changes less risky.| Lukas Valatka
I love Kubernetes — for production, it’s rock-solid. But throw me a notebook that needs 0.5 TB of RAM and a GPU, and suddenly K8s feels like a chore.| Lukas Valatka
I was attending a DuckDB meetup in Leuven, where a presenter was detailing how they replaced Spark with DuckDB to cut costs, when rather abruptly, someone in the audience asked, “How do I know if I should really use DuckDB now, me being not a data engineer? What if my query is heavier than expected? What if my data fluctuates and just sometimes goes beyond a single machine?”. The presenter admitted that you set up the architecture once, do your best benchmarking, and hope for the data dyn...| Lukas Valatka
I’ll give it a few years until MLflow dominates the model package format space, with alternatives like SageMaker models fading away, and sharing pure weights becoming an arcane art. But until that dominance is absolute, I’ve been thinking that there’s another quite obvious way to package models: just store them as wheels. Packaging == persisting the trained model.| Lukas Valatka
For the past few months, I’ve been exploring Go. Having done quite a bit of grueling work shaving off milliseconds from Python web apps, I’ve found Go to be incredible. You can schedule dirt cheap concurrent operations — simply by adding go in front of a function call — and achieve true parallelism across cores.| Lukas Valatka
I have always been intrigued by Bloom filters. They are very similar to hash sets but somehow consume much less memory. Sometimes, they can yield false positives, creating a peculiar tradeoff.| Lukas Valatka
Recently a data platform vendor introduced us to their latest offering: a feature store module for machine learning projects. They highlighted the usual selling points, like reducing train-serve skew, feature backfilling, and enhancing feature documentation. However, one of our experienced senior data engineers remained unconvinced. The question on their mind was straightforward: “Is a feature store truly necessary when we already have a fully operational data platform?”. Indeed, there mi...| Lukas Valatka
This weekend, I participated in FOSDEM 2025, which impressed me more than I had imagined. Boy, was it big—over 1,000 talks! Here are a few notable takeaways:| Lukas Valatka
In my view, neither performance nor trying to be Python-aligned is what sets uv apart. Don’t get me wrong: try switching from uv to Poetry, and you’ll quickly notice how sluggish Poetry feels. uv goes the extra mile to adhere to PEPs, and IMHO it’s the go-to package manager for Python these days. But these aren’t the features that surprised me most.| Lukas Valatka