Article URL: https://erikbern.com/2024/09/27/its-hard-to-write-code-for-humans.html Comments URL: https://news.ycombinator.com/item?id=41668304 Points: 69 # Comments: 23| Hacker News: Newest
As I am en route to see my first total solar eclipse, I was curious how hard it would be to compute eclipses in Python.| Erik Bernhardsson
Long story short: I'm working on a super cool tool called Modal. Please check it out — it lets you run things in the cloud without having to think about infrastructure. Scaling out, scheduling, containerization, using GPUs, setting up webhooks, and all kinds of other stuff. It's primarily meant for data teams. We aren't quite live, but you can sign up for our waitlist.| Erik Bernhardsson
This is in many respects a successor to a| Erik Bernhardsson
Hi! It's your friendly project management theorist. You might remember me from blog posts such as Why software projects take longer than you think, which is a blog post I wrote a long time ago positing that software project completion times follow a log-normal distribution.| Erik Bernhardsson
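As a small illustration of the log-normal claim, here is a minimal Python sketch. The parameters `mu` and `sigma` and the function names are assumptions chosen for illustration, not values or code from the post. It relies only on a standard property of the log-normal: its mean, exp(mu + sigma^2 / 2), is always larger than its median, exp(mu).

```python
import math
import random
import statistics


def lognormal_median(mu: float, sigma: float) -> float:
    """Median of a log-normal distribution: exp(mu)."""
    return math.exp(mu)


def lognormal_mean(mu: float, sigma: float) -> float:
    """Mean of a log-normal distribution: exp(mu + sigma^2 / 2)."""
    return math.exp(mu + sigma ** 2 / 2)


def simulate_completion_times(mu: float, sigma: float, n: int, seed: int = 0) -> list:
    """Draw n hypothetical completion times from a log-normal distribution."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


if __name__ == "__main__":
    # mu=0, sigma=1 are illustrative assumptions, not fitted to any data.
    samples = simulate_completion_times(mu=0.0, sigma=1.0, n=20000)
    print("analytic median:", lognormal_median(0.0, 1.0))
    print("analytic mean:  ", lognormal_mean(0.0, 1.0))
    print("sample median:  ", statistics.median(samples))
    print("sample mean:    ", statistics.mean(samples))
```

The gap between mean and median widens as `sigma` grows, which is why the long right tail matters more for highly uncertain projects.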
Here's a theory I have about cloud vendors (AWS, Azure, GCP):| Erik Bernhardsson
This isn't as much of a blog post as an elaboration of a tweet I posted the other day:| Erik Bernhardsson
I joined Better in early 2015 because I thought the team was crazy enough to actually change one of the largest industries in the US. For six years, I ran the tech team, hiring 300+ people, probably doing 2,000+ interviews, and according to GitHub I added 646,941 lines of code and removed 339,164. But I also got married, had two kids, bought an apartment and renovated it! From time to time, there were some intense periods of hard work.| Erik Bernhardsson
It's a popular attitude among developers to rant about our tools and how broken things are. Maybe I'm an optimistic person, because my viewpoint is the complete opposite! I had my first job as a software engineer in 1999, and in the last two decades I've seen software engineering changing in ways that have made us orders of magnitude more productive. Just some examples from things I've worked on or close to:| Erik Bernhardsson
I spent a ton of time looking at different software providers, both as a CTO, and as a nerd “advanced” consumer who builds stuff in my spare time. In the last 10 years, there has been an order of magnitude more products that cater directly to developers, through APIs, SDKs, and tooling. I'm pretty psyched about this trend. As the cost of building software goes down, that drives up the demand for software engineers. That then drives up the demand for even more products built for software e...| Erik Bernhardsson
We live in a year of about 350,000 amateur epidemiologists and I have no desire to join that “club”. But I read something about COVID-19 deaths that I thought was interesting and wanted to see if I could replicate it through data. Basically the claim is that Sweden had an exceptionally “good” year in 2019 in terms of influenza deaths, causing there to be more deaths “overdue” in 2020.| Erik Bernhardsson
Compensation has always been one of the most confusing parts of management to me. Getting it right is obviously extremely important. Compensation is what drives our entire economy, and you could look at the market for labor as one gigantic resource-allocating machine in the same way as people look at the stock market as a gigantic resource-allocating machine for investments.| Erik Bernhardsson
Hanlon's razor is a classic aphorism I'm sure you have heard before: Never attribute to malice that which can be adequately explained by stupidity.| Erik Bernhardsson
Let's consider a toy model where you're hiring for two things and that those are equally valuable. It's not very important what those are, so let's just call them “thing A” and “thing B” for now. For one set of abilities, the scatter plot looks like this:| Erik Bernhardsson
I recently finished the excellent book Kochland. This isn't my first interest in Koch—I read The Science of Success by Charles Koch himself a couple of years ago.| Erik Bernhardsson
Just a quick note that my team is always hiring at Better. A lot of new people have been joining the team here in NYC lately—the tech team has actually grown from 35 to 60 in just ~3 months. We are primarily looking for senior software engineers and/or engineering managers. But we would love to talk if you have less experience too! Our main tech stack is mostly TypeScript and Python.| Erik Bernhardsson
My company has a buffet every Friday, and the lines grow to epic proportions when the food arrives. I've suspected for years that the “classic” buffet line system is a deeply flawed and inefficient method, and every time I'm stuck in the line has made me more convinced.| Erik Bernhardsson
No one asked for this, but I'm something like ~12 years into my career and have had my fair share of mistakes and luck so I thought I'd share some.| Erik Bernhardsson
This is a blog post originally featured on the Better engineering blog. If you want to link to this article or share it, please go to the original post URL! Separately, I'm sorry it's been so long with no posts on this blog. Between kids, moving, and being a startup CTO, I've been busy. I have a few posts coming down the pipe though, so stay tuned…| Erik Bernhardsson
When I started building up a tech team for Better, I made a very conscious decision to pay at the high end to get people. I thought this made more sense: they cost a bit more money to hire, but output usually more than compensates for it. Among fellow CTOs, some went for the other side of the spectrum. This was a mystery to me, until it all made sense.| Erik Bernhardsson
A modern tech stack typically involves at least a frontend and backend but relatively quickly also grows to include a data platform. This typically grows out of the need for ad-hoc analysis and reporting but possibly evolves into a whole oil refinery of cronjobs, dashboards, bulk data copying, and much more. What generally pushes things into the data platform is that a number of things are| Erik Bernhardsson
It started with a tweet:| Erik Bernhardsson
This is a bit of a rant but I really don't like software that invents its own query language. There's a trillion different ORMs out there. Another trillion databases with their own query language. Another trillion SaaS products where the only way to query is to learn some random query DSL they made up.| Erik Bernhardsson
I get bored reading management books very easily and lately I've been reading about a wide range of almost arbitrary topics. One of the lenses I tend to read through is to see different management styles in different environments.| Erik Bernhardsson
As some of you may know, one of my side interests is approximate nearest neighbor algorithms. I'm the author of Annoy, a library with 3,500+ stars on Github as of today. It offers fast approximate search for nearest neighbors with the additional benefit that you can load data super fast from disk using mmap. I built it at Spotify to use for music recommendations where it's still used to power millions (maybe billions) of music recommendations every day.| Erik Bernhardsson
Ok, so I have to first preface this whole blog post by a few things:| Erik Bernhardsson
I have done roughly 2,000 interviews in my life. When I started recruiting, I had so much confidence in my ability to assess people. Let me just throw a couple of algorithm questions at a candidate and then I'll tell you if they are good or not!| Erik Bernhardsson
I've been reading up on operations research lately, including queueing theory. It started out as a way to understand the very complex mortgage process (I work at a mortgage startup) but it's turned into my little hammer and now I see nails everywhere.| Erik Bernhardsson
I started writing this blog in late 2012, partly because I felt like it would help me improve my English and my writing skills, partly because I kept having a lot of random ideas in my head and I wanted to write them down somewhere. I honestly never cared too much about finding a particular niche, I just wanted to write down stuff that I found interesting. I set up a Wordpress blog on my crappy Swedish virtual private server.| Erik Bernhardsson
UPDATE(2018-06-17): There is a later blog post with newer benchmarks!| Erik Bernhardsson
I'm interrupting the regular programming for a quick announcement: we're looking for data engineers at Better. You would be the first one to join and would work a lot directly with me.| Erik Bernhardsson
Turns out having a toddler isn't super compatible with reading. I used to read ~100 books/year as a teenager, but it has slowly deteriorated to maybe 20-30 books, at most. And I don't even finish all of them because life is too short! Some books are just not that interesting. So what were some of the books worth mentioning?| Erik Bernhardsson
I spent a few days during the holidays fixing up a bunch of semi-dormant open source projects and I have a couple of blog posts in the pipeline about various updates. First up, I made a number of fixes to Git of Theseus which is a tool (written in Python) that generates statistics about Git repositories. I've written about it previously on this blog. The name is a horrible pun (I'm a dad!) on Ship of Theseus which is a philosophical thought experiment about what happens if you replace every s...| Erik Bernhardsson
I spent six years at a company that went from 50 people to 1500 and one contributing factor leading to my departure was that I went from a “maker” to a person stuck in meetings every day. It wasn't that I wanted to do that, but everyone else kept dragging me into meetings.| Erik Bernhardsson
I had an interesting idea a few weeks ago, best explained through an example. Let's say you're running an e-commerce site (I kind of do) and you want to optimize the number of purchases.| Erik Bernhardsson
I've been a bit bad at posting things with a regular cadence lately, partly because I'm trying to adjust to having a toddler, partly because the hunt for clicks has caused such a high bar for me that I feel like I have to post something Pulitzer-worthy. But things are always cooking, so let's break this pattern with a quick notice on something I've been working on!| Erik Bernhardsson
There are often close relationships between top level business metrics. For instance, it's well known that retention has a super strong impact on the valuation of a subscription business. Or that the % of occupied seats is super important for an airline. A fun little toy model that I came up with generates a curious relationship between conversion rates and revenue.| Erik Bernhardsson
A funny thing about being a foreigner is how you realize people take broken things for granted. I'm going to go out on a limb here claiming that the US has a pretty dumb banking system. I could talk about it all day, but right now I want to focus on a very particular piece of it: how to verify your identity online.| Erik Bernhardsson
Just for fun, I generated these graphs of the number of letters in the word for each number. I really spent about 10 minutes on this (ok…possibly also another 40 minutes tweaking the plots):| Erik Bernhardsson
Here's a dumb extremely accurate rule I'm postulating* for software engineering projects: *you need at least 3 examples before you solve the right problem*.| Erik Bernhardsson
I just bought Machine, Platform, Crowd: Harnessing Our Digital Future and discovered that it mentions my blog – in particular the post When machine learning matters.| Erik Bernhardsson
There's about 765 million blog posts about the diversity “memo” that leaked out of Google a couple of weeks ago. I think the case for any biological difference is pretty weak, and it bothers me when people refer to an “interest gap” as anything else than caused by the environment. Maybe because I have a daughter, maybe because I have too many female friends who told me stories how they were held back or discriminated against.| Erik Bernhardsson
I just spent a few days in Italy, on the Ligurian coast. Even though we were on the west side of Italy, the Mediterranean sea was to the east, because the house was situated on a long bay. But zooming in even more, there were parts of the coast that were even more twisted – to the point where it had turned a full 360 degrees so you ended up having the sea to the west again.| Erik Bernhardsson
Remember when everyone had a really ugly blog with a blogroll? Anyway, just think the word is funny.| Erik Bernhardsson
How hard can it be to compute conversion rate? Take the total number of users that converted and divide them with the total number of users. Done. Except… it's a lot more complicated when you have any sort of significant time lag.| Erik Bernhardsson
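To make the time-lag problem concrete, here is a minimal Python sketch comparing the naive ratio with a fixed-window cohort rate. The fixed-window approach is one common way to handle lag and is an assumption here, not necessarily the method the post arrives at. The `User` type, the function names, and the sample data are all hypothetical.

```python
from typing import List, NamedTuple, Optional


class User(NamedTuple):
    signup_day: int
    convert_day: Optional[int]  # None if the user has not converted (yet)


def naive_rate(users: List[User]) -> Optional[float]:
    """Converted users divided by all users, ignoring how long each has had."""
    if not users:
        return None
    converted = sum(1 for u in users if u.convert_day is not None)
    return converted / len(users)


def windowed_rate(users: List[User], today: int, window: int) -> Optional[float]:
    """Share of users converting within `window` days of signup.

    Only users whose full window has elapsed by `today` are counted,
    so recent signups do not drag the rate down.
    """
    eligible = [u for u in users if u.signup_day + window <= today]
    if not eligible:
        return None
    converted = sum(
        1
        for u in eligible
        if u.convert_day is not None and u.convert_day - u.signup_day <= window
    )
    return converted / len(eligible)


if __name__ == "__main__":
    # Made-up data: two old users (one converted) and two very recent signups.
    users = [User(0, 5), User(0, None), User(28, None), User(29, None)]
    print(naive_rate(users))                          # recent signups deflate this
    print(windowed_rate(users, today=30, window=10))  # only the two old users count
```

The trade-off in this sketch is that the windowed rate ignores the newest users entirely and any conversions that happen after the window closes.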
I've read about 100 management books by now but if there's something that always bothered me it's the lack of first principles thinking. Basically it's a ton of heuristics. And heuristics are great, but when you present heuristics as true objectives, it kind of clouds the underlying objectives (and you end up with weird proxy cults like the Agile movement 👹 – not that I disagree with it, I just wish they could derive it from a more systematic understanding of project management).| Erik Bernhardsson
I was reading yet another blog post titled “Why our team moved from <language X> to <language Y>” (I forgot which one) and I started wondering if you can generalize it a bit. Is it possible to generate an N * N contingency table of moving from language X to language Y?| Erik Bernhardsson
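As a rough sketch of what such a table could look like in code, the Python below counts (from, to) pairs into an N by N matrix. How the pairs would be collected is not described in this excerpt, so the `moves` list is made-up illustration data and `contingency_table` is a hypothetical helper.

```python
from collections import Counter
from typing import List, Tuple


def contingency_table(moves: List[Tuple[str, str]]):
    """Build an N x N table where table[i][j] counts moves from
    languages[i] to languages[j]."""
    # Every language seen on either side of a move gets a row and a column.
    languages = sorted({lang for pair in moves for lang in pair})
    counts = Counter(moves)
    table = [[counts[(src, dst)] for dst in languages] for src in languages]
    return languages, table


if __name__ == "__main__":
    # Hypothetical observations of "moved from X to Y".
    moves = [("python", "go"), ("python", "go"), ("java", "go"), ("ruby", "python")]
    languages, table = contingency_table(moves)
    print(languages)
    for name, row in zip(languages, table):
        print(name, row)
```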
I just realized last Thursday that I have spent two full years at Better, incidentally on the same day as we announced a $15M round led by Kleiner Perkins. So it was a good point to reflect a bit and think back – what the F led me to abandon my role managing the machine learning team at Spotify? To join some random startup in the world's most boring industry? So here's my justification why I love being where I am:| Erik Bernhardsson
Here's a fun analysis that I did of the pitch (aka. frequency) of various languages. Certain languages are simply pronounced with lower or higher pitch. Whether this is a feature of the language or more a cultural thing is a good question, but there are some substantial differences between languages.| Erik Bernhardsson
This is a pretty dumb post, in which I argue that functional programming has a lot of the bad parts of libertarianism and a lot of the good parts:| Erik Bernhardsson
This blog post Data sets are the new server rooms makes the point that a bunch of companies raise a ton of money to go get really proprietary awesome data as a competitive moat. Because once you have the data, you can build a better product, and no one can copy it (at least not very cheaply). Ideally you hit a virtuous cycle as well, where usage of your system once it takes off gives even more data, which makes the system even better, which attracts more users…| Erik Bernhardsson
Pareto efficiency is a useful concept I like to think about. It often comes up when you compare items on multiple dimensions. Say you want to buy a new TV. To simplify it let's assume you only care about two factors: price and quality. We don't know what you are willing to pay for quality – but we know that, all else being equal:| Erik Bernhardsson
I generally haven't written much about software architecture. People make heuristics into religion. But here is something I thought about: how to build in self-correction into systems. This has been something just vaguely sitting in my head lacking a clear conceptual definition until a whole slew of things popped up today that all had the exact same issue at its core. I'm going to refer to it as state drift lacking a better term for it.| Erik Bernhardsson
I joined Spotify in 2008 to focus on machine learning and music recommendations. It's easy to forget, but Spotify's key differentiator back then was the low-latency playback. People would say that it felt like they had the music on their own hard drive. (The other key differentiator was licensing – until early 2009 Spotify basically just had all kinds of weird stuff that employees had uploaded. In 2009 after a crazy amount of negotiation the music labels agreed to try it out as an experimen...| Erik Bernhardsson
As you may know, one of my (very geeky) interests is Approximate nearest neighbor methods, and I'm the author of a Python package called Annoy.| Erik Bernhardsson
I've been trying to learn Clojure. I keep telling people I meet that I really want to learn Clojure, but still every night I can't get myself to spend time with it. It's unclear if I really want to learn Clojure or just want to have learned Clojure?| Erik Bernhardsson
(I accidentally published an unfinished draft of this post a few days ago – sorry about that).| Erik Bernhardsson
One of my favorite business hobbies is to reduce some nasty decision down to its absolute core objective, decide the most basic strategy, and then add more and more modifications as you have to confront the complexity of reality (yes I have very lame hobbies thanks I know).| Erik Bernhardsson
I do a lot of recruiting and have given maybe 50 offers in my career. Although many companies do, I never put a deadline on any of them. Unfortunately, I've often ended up competing with other companies who do, and I feel really bad that this usually tricks younger developers into signing offers. On numerous occasions, I've gotten an email halfway through the interview process| Erik Bernhardsson
(This is not a very relevant/useful post for regular readers – feel free to skip. I thought I would share it so people can find it on Google.)| Erik Bernhardsson
Here's a conclusion I've made building consumer products for many years: the speed at which a company innovates is limited by its iteration speed.| Erik Bernhardsson
I've been spending several hundred bucks renting GPU instances on AWS over the last year. The speedup from a GPU is awesome and hard to deny. GPUs have taken over the field. Maybe following the footsteps of Bitcoin mining there's some research on using FPGA (I know very little about this).| Erik Bernhardsson
My blog post about fonts generated lots of traffic – it landed on Hacker News, took down my site while I was sleeping, and then obviously vanished from HN before I woke up. But it also got retweeted by a ton of people.| Erik Bernhardsson
For some reason I decided one night I wanted to get a bunch of fonts. A lot of them. An hour later I had a bunch of scrapy scripts pulling down fonts and a few days later I had more than 50k fonts on my computer.| Erik Bernhardsson
The easiest way to be a 10x engineer is to make 10 other engineers 2x more efficient. Someone can be a 10x engineer if they do nothing for 364 days and then convince the team to change programming language to a 2x more productive language.| Erik Bernhardsson
Early last year when I left Spotify I decided to do more reading. I was planning to read at least one book per week and in particular I wanted to brush up on management, economics, and technology. 2015 was also a year of exclusively non-fiction, which is a pretty drastic shift, since I grew up reading fiction compulsively for 20 years.| Erik Bernhardsson
I've been obsessed with how to iterate quickly based on small scale feedback lately. One awesome website I encountered is Usability Hub which lets you run 5 second tests. Users see your site for 5 seconds and you can ask them free-form questions afterwards. The nice thing is you don't even have to build the site – just upload a static png/jpg and collect data.| Erik Bernhardsson
(Warning: super speculative, feel free to ignore)| Erik Bernhardsson
Curious about Google's newly released TensorFlow? I don't have a beefy GPU machine, so I spent some time getting it to run on EC2. The steps on how to reproduce it are pretty brutal and I wouldn't recommend going through it unless you want to waste five hours of your life.| Erik Bernhardsson
I haven't mentioned what I'm currently up to. Earlier this year I left Spotify to join a small startup called Better. We're going after one of the biggest industries in the world that also turns out to be completely broken. The mortgage industry might not be the #1 industry you pictured yourself in, but it's an enormous opportunity to fix a series of real consumer problems and join a company that I predict will be huge.| Erik Bernhardsson
The other day I was looking at marketing spend broken down by channel and wanted to compute some simple uncertainty estimates. I have data like this:| Erik Bernhardsson
I was featured in Peadar Coyle's interview series interviewing various “data scientists” – which is kind of arguable since (a) all the other ppl in that series are much cooler than me (b) I'm not really a data scientist. Anyway, reposting the full interview:| Erik Bernhardsson
This is another post based on my talk at NYC Machine Learning. The previous two parts covered most of the interesting parts, but there are still some topics left to be discussed. To go back and read the meaty stuff, check out| Erik Bernhardsson
This is a blog post rewritten from a presentation at NYC Machine Learning on Sep 17. It covers a library called Annoy that I have built that helps you do nearest neighbor queries in high dimensional spaces. In the first part, I went through some examples of why vector models are useful. In the second part I will be explaining the data structures and algorithms that Annoy uses to do approximate nearest neighbor queries.| Erik Bernhardsson
This is a blog post rewritten from a presentation at NYC Machine Learning last week. It covers a library called Annoy that I have built that helps you do (approximate) nearest neighbor queries in high dimensional spaces. I will be splitting it into several parts. This first talks about vector models, how to measure similarity, and why nearest neighbor queries are useful.| Erik Bernhardsson
A couple of people in my old team have been around talking about how Spotify does music recommendations and put together some quite good presentations.| Erik Bernhardsson
I was playing around with D3 last night and built a silly visualization of antipodes and how our intuitive understanding of the world sometimes doesn't make sense. Check out the visualization at bl.ocks.org!| Erik Bernhardsson
Every once in a while when talking to smart people the topic of automation comes up. Technology has made lots of occupations redundant, so what's next?| Erik Bernhardsson
Here's a problem that I used to give to candidates. I stopped using it seriously a long time ago since I don't believe in puzzles, but I think it's kind of fun.| Erik Bernhardsson
Annoy is a library written by me that supports fast approximate nearest neighbor queries. Say you have a high (1-1000) dimensional space with points in it, and you want to find the nearest neighbors to some point. Annoy gives you a way to do this very quickly. It could be points on a map, but also word vectors in a latent semantic representation or latent item vectors in collaborative filtering.| Erik Bernhardsson
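To show what a nearest neighbor query returns, here is a minimal Python sketch of the exact brute-force baseline. This is deliberately not Annoy's approximate method and does not use Annoy's API: it scans every point, which is the slow approach that approximate indexes aim to avoid. The function names and sample vectors are hypothetical, and cosine distance is just one possible choice of metric.

```python
import math
from typing import List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; assumes both vectors are nonzero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest_neighbors(points: List[Sequence[float]], query: Sequence[float], n: int) -> List[int]:
    """Return the indices of the n points closest to `query`.

    Exact search: computes the distance to every point, so cost grows
    linearly with the number of points.
    """
    ranked = sorted(range(len(points)), key=lambda i: cosine_distance(points[i], query))
    return ranked[:n]


if __name__ == "__main__":
    # Tiny made-up 2-dimensional data set for illustration.
    points = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
    print(nearest_neighbors(points, query=[1.0, 0.1], n=2))
```

A baseline like this is also handy for checking the recall of an approximate index on a small sample.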
The workflow engine battle has intensified with some more interesting entries lately! Here are a couple I encountered in the last few days. I love that at least two of them are direct references to Luigi!| Erik Bernhardsson
I have spent some time lately with D3. It's a lot of fun to build interactive graphs. See for instance this demo (will provide a longer writeup soon).| Erik Bernhardsson
Note: this post is full of pseudo-psychology and highly speculative content. Like most fun stuff!| Erik Bernhardsson
Saw this link on Hacker News the other day: The Highway Lane Next to Yours Isn’t Really Moving Any Faster| Erik Bernhardsson
Sometimes you have these awesome insights. A few days ago I got an idea for how to improve index building in Annoy.| Erik Bernhardsson
Annoy is a C++/Python package I built for fast approximate nearest neighbor search in high dimensional spaces. Spotify uses it a lot to find similar items. First, matrix factorization gives a low dimensional representation of each item (artist/album/track/user) so that every item is a k-dimensional vector, where k is typically 40-100. This is then loaded into an Annoy index for a number of things: fast similar items, personal music recommendations, etc.| Erik Bernhardsson
I just pinged a few million random IP addresses from my apartment in NYC. Here's the result:| Erik Bernhardsson
There's a bunch of companies working on machine learning as a service. Some old companies like Google, but now also Amazon and Microsoft.| Erik Bernhardsson
As noted by multiple tweets, my previous post describes a phenomenon denoted Berkson's paradox.| Erik Bernhardsson
I saw a bunch of tweets over the weekend about Peter Norvig claiming there's a negative correlation between being good at programming competitions and being good at the job. There were some decent Hacker News comments on it.| Erik Bernhardsson
Pinterest just open sourced Pinball which seems like an interesting Luigi alternative. There's two blog posts: Pinball: Building workflow management (from 2014) and Open-sourcing Pinball (from this week). The author has a comment in the comments thread on Hacker News:| Erik Bernhardsson
Wow I guess it was more than a year ago that I tweeted this. Crazy how time flies by. Anyway, here's my rationale:| Erik Bernhardsson
How to sabotage software productivity, in the style of CIA| Erik Bernhardsson
I've written before about the importance of iterating quickly but I didn't necessarily talk about some concrete things you can do. When I've built up the tech team at Better, I've intentionally optimized for fast iteration speed above almost everything else.| Erik Bernhardsson
As a project evolves, does the new code just add on top of the old code? Or does it replace the old code slowly over time? In order to understand this, I built a little thing to analyze Git projects, with help from the formidable GitPython project.| Erik Bernhardsson
Anyone who built software for a while knows that estimating how long something is going to take is hard. It's hard to come up with an unbiased estimate of how long something will take, when fundamentally the work in itself is about solving something.| Erik Bernhardsson
You are brought into a startup to run their three-person data team. This is a story about teams and organization, and how you spend a year getting the team to a good place.| Erik Bernhardsson
Why does it suck to wait for things? In a previous post I analyzed a NYC subway dataset and found that at some point, quite early, it's worth just giving up. This isn't a proof that the subway doesn't run on time; in fact it might actually prove that the subway runs really well.| Erik Bernhardsson
Apparently MTA (the company running the NYC subway) has a real-time API. My fascination for the subway takes autistic proportions and so obviously I had to analyze some of the data.| Erik Bernhardsson