On Tuesday the 22nd of July, around 09:00 UTC, our users experienced a slower-than-usual website, API and application. This continued until the next day, July 23rd, and was ultimately resolved around 17:30 UTC. This post highlights the cause and the fix we've applied. Several misleading metrics caused this issue to last longer than we'd have expected, which makes for an interesting retrospective. As is always the case when debugging: sometimes it's hard to distinguish cause fr...