Missing the Forest for the Sequence Trees.| lewiscampbell.tech
Learn how queues make horizontal scaling, scheduling, and flow control easier in cloud systems, and how to make them durable and observable.| www.dbos.dev
Dens Sumesh - Software Engineer| densumesh.dev
Why do we use caches at all? Can databases fully replace them?| avi.im
A site with statistics regarding the decentralization status of various web services| arewedecentralizedyet.online
Agile approaches like Scrum recommend a "just enough" attitude in software development and this is also the case when you discuss tools. Ideally, you would work with a small team that is collocated, but this is not always possible and you might be running your project virtually with a distributed Scrum team scattered around the world. If you don't want to start using a sophisticated tool to manage your efforts, you might be interested in adopting some web tools that will fit your particular n...| Scrum Agile Project Management Expert
Fixed-size ring buffers with full lock freedom| h4x0r.org
‘Brazilian storage regulation could generate $2bn in five years’: Consultancy Envol says nascent Joint Technical Note No. 13/2025 could bring investor security with the potential to install at least 2 GW of batteries. Industry insiders are calling for storage regulation to be accelerated.| Energy Storage
There’s a semi-well-known adage in software development that says when you have a hard code change, you should “first make the hard change easy, and then make the easy change.” In other words, refactor the code (or do whatever else you need to do) to simplify the change you’re trying to make| blog.appliedcomputing.io
Sharing is Scaring: Why is Cloud File-Sharing Hard?| blog.brownplt.org
L2AW| law-theorem.com
Reinforcement learning meets iterated game theory meets theory of mind| The Dan MacKinlay stable of variably-well-consider’d enterprises
“The [structural] mechanism producing these problematic outcomes is really robust and hard to resolve.”…| Ars Technica
Matrix, the open protocol for secure decentralised communications| matrix.org
What even is distributed systems| notes.eatonphil.com
1 Background| jepsen.io
Viewstamped replication (VR) is a replication technique that handles failures when one or more nodes in a cluster crash. It works as a wrapper on top of a non-distributed system and allows the underlying business logic to be applied independently while the protocol itself takes care of replication. The protocol was introduced in an original paper and later revised with a set of optimizations in a new paper, Viewstamped Replication Revisited.| Distributed Computing Musings
If you’re building a backend mostly alone, Elixir lets you avoid service sprawl and ship features faster.| I'm Konstantin
The open-source AI scene has been kicking goals. People have wrestled models, datasets, and all the fixings away from the big-wigs. The final boss of that game is the access to expensive compute. Training a foundation model from scratch takes a warehouse full of GPUs that costs more than a small nation’s GDP. It’s been the one thing keeping AI development firmly in the hands of a few tech giants with cash to burn. Until now, maybe. So, the citizen science equivalent for the NN a...| The Dan MacKinlay stable of variably-well-consider’d enterprises
Sungrow Power has revealed its newest hybrid residential energy storage system. Upgrades to its MG Series inverters and a battery will be launched in Q4.| Energy Storage
Cloud databases face a fundamental challenge: how to remain available and durable under node failures? Modern cloud databases approach this by separating two concerns that used to be tightly coupled: compute and storage. The database engine becomes stateless, while the write-ahead log gets replicated across multiple nodes to guarantee durability. If a database server dies, another one can pick up exactly where it left off by reading from the replicated log.| Benjamin Hilprecht
Matrix, the open protocol for secure decentralised communications| matrix.org
We share learnings and best practices for testing durable execution based on our testing of DBOS.| www.dbos.dev
A list of key concepts for building and testing reliable distributed systems, with basic definitions and deep references.| antithesis.com
Coarse-graining empowerment| The Dan MacKinlay stable of variably-well-consider’d enterprises
Monitoring my Homelab, Simply| b.tuxes.uk
I’ve had a few conversations about async code recently (and not so recently) and seen some code that seems to make wrong assumptions about async, so I figured it was time to have a serious chat about async, what it’s for, what it guarantees and what it doesn’t.| Il y a du thé renversé au bord de la table !
Lots of projects claim to be the “smallest” or “simplest” Kubernetes, but they never provide data to back it up. Let’s look at how these distributions compare to Talos Linux. Note that Talos Linux is not a Kubernetes distribution, but rather a Linux distribution purpose-built for running upstream Kubernetes. Before we look at the data, […]| Sidero Labs
How we designed our database for complete control over concurrency, time, randomness, and failure injection.| Discover the Performance Engineer in you. | Polar Signals
Zero-ETL search and analytics for Postgres| ParadeDB
Tail Latency Might Matter More Than You Think| brooker.co.za
Put that script inside a folder, share the folder with someone via Syncthing or Dropbox or whatever,| holdtherobot.com
Learn how to capture and inspect traffic to your Kubernetes resources using mirrord dump which is a built-in tool for debugging.| MetalBear 🐻
LLM agents don’t require new architectures, but they do break the assumptions your platform relies on for safety, observability, and control. This post breaks down what really changes when agents hit production, and what platform teams should do about it.| www.junctionlabs.io
Retries| justinblank.com
Last update: March 28, 2023.| www.infocentral.org
Author: Igor Konnov| Protocols Made Fun
This post addresses the "quiet fediverse" problem, where users often experience fragmented conversations on decentralized social networks. The core issue stems from ActivityPub's distributed nature, where conversations are spread across multiple servers, leading to incomplete views of discussions. The author explores two main approaches to solve this: reply tree crawling and the context owner approach. Reply tree crawling, pioneered by Mastodon, involves fetching all replies to reconstruct th...| Hackers' Pub
rqlite is a lightweight, open-source, distributed relational database written in Go, which uses SQLite as its storage engine and Raft for consensus. Ten years ago, I tagged the first release of rqlite—a project I started to explore distributed consensus, to learn Go, and to build something useful. I had no roadmap, no plans for scale, and certainly no…| Vallified
MSc Computer science student| Fabian Lindfors
How deterministic simulation testing can help us build more reliable distributed systems and bridge the gap between development and production environments.| Pierre Zemb's Blog
An interesting inverse design question: how should I design a system to optimise for truthfulness? Brief summary here. @Frongillo2024Recent: This note provides a survey for the Economics and Computation community of some recent trends in the field of information elicitation. At its core, the field concerns the design of incentives for strategic agents to provide accurate and truthful information. Such incentives are formalized as proper scoring rules, and turn out to be the same obj...| The Dan MacKinlay stable of variably-well-consider’d enterprises
Learning agents in a multi-agent system which account for and/or exploit the fact that other agents are learning too. This is one way of formalising the idea of theory of mind. Learning with theory of mind works out nicely for reinforcement learning, in e.g. opponent shaping, and may be an important tool for understanding AI agency and AI alignment, as well as aligning more general human systems. Other interesting things might arise from a good theory of other-aware learning, such ...| The Dan MacKinlay stable of variably-well-consider’d enterprises
1 Update, 2025-05-03| jepsen.io
The last few days I spent some time digging into the recently announced KIP-1150 ("Diskless Kafka"), as well as AutoMQ’s Kafka fork, tightly integrating Apache Kafka and object storage, such as S3. Following the example set by WarpStream, these projects aim to substantially improve the experience of using Kafka in cloud environments, providing better elasticity, drastically reducing cost, and paving the way towards native lakehouse integration. This got me thinking, if we were to start all ove...| www.morling.dev
A curated collection of resources about deterministic simulation testing for distributed systems.| Pierre Zemb's Blog
We’re introducing a new layer of verification — a user-friendly, easily recognizable badge. Additionally, independent organizations can verify accounts directly through our Trusted Verifiers feature.| Bluesky
I am currently winding down the Mastodon bots I used to post sunrise and sunset times. The precipitating event is that the admin of the instance hosting the associated accounts demanded they be made nigh-undiscoverable, but the underlying cause is that it’s become increasingly clear that Mastodon isn’t, and won’t ever be, a good platform for “asynchronous ephemeral notifications of any kind”. I’d also argue (more controversially) that it’s simply not good infrastructure for social...| Rob’s Posts
Personal website for some random tidbits I work on| maknee.github.io
Tips and lessons learned from building systems directly against object stores| SpiralDB
Bug Bash 2025 Conference Experience| concerningquality.com
#Git With Me| git.sr.ht
Neighbourhoodie Software is a software development company based in Berlin, Germany. We are experts in CouchDB, PouchDB, and Offline First.| neighbourhood.ie
Learn how Sequin implements Postgres logical replication with guaranteed message delivery and ordering. Discover how we built a high-throughput data pipeline without missing events.| Sequin blog
On rituals without (necessarily) faith. TBD Related: tribal bonding, mind altering substances. 1 Incoming Future Day 2025 – Science, Technology & the Future Wheal’s Homegrown Humans Newsletter Ritual Behavior, Habits, Human Culture, Religion, Civilization, Marriage, Death, Burning Man & Community | Dimitris Xygalatas | #75 (5) Psychedelics, Civilization, Religion, Death & Plant Medicine | Brian Muraresku | #1 Shamanism, Psychedelics, Social Behavior, Religion & Evolution of Huma...| The Dan MacKinlay stable of variably-well-consider’d enterprises
Notebook on the idea of human domestication. Possibly the opposite of being stroppy. Paul Christiano, What failure looks like Amongst the broader population, many folk already have a vague picture of the overall trajectory of the world and a vague sense that something has gone wrong. There may be significant populist pushes for reform, but in general these won’t be well-directed. Some states may really put on the brakes, but they will rapidly fall behind economically and militaril...| The Dan MacKinlay stable of variably-well-consider’d enterprises
In the "Let’s Take a Look at…!" blog series I am going to explore interesting projects, developments and technologies in the data and streaming space. This can be KIPs and FLIPs, open-source projects, services, and more. The idea is to get some hands-on experience, learn about potential use cases and applications, and understand the trade-offs involved. If you think there’s a specific subject I should take a look at, let me know in the comments below! That guy above? Yep, that’s me, w...| www.morling.dev
The financial transactions database designed for mission critical safety and performance. - tigerbeetle/tigerbeetle| GitHub
a git collaboration platform, built on atproto| blog.tangled.sh
Here is how to implement a distributed lock with S3| quanttype.net
Antithesis' ability to play like a computer, not a human being, is central both to finding bugs and beating side-scrolling shooters.| antithesis.com
Why are scalable systems locally-inefficient, and locally-efficient systems unscalable? Plus, new book release!| buttondown.com
If you’re interested in supporting mybinder.org with cloud resources, financial resources, or human resources, please see the Support Binder page for how you can help. tl;dr: The 2i2c team is joining the mybinder.| 2i2c
Nation-scale Matrix deployments will fail if built on the community version of Synapse. Huge deployments need a different architecture, which is what Synapse Pro delivers.| Element Blog
Pushing the whole company into the past on purpose| rachelbythebay.com
OK, queues.| ferd.ca
While it’s trivial to measure the end-to-end runtime of a Dask workload, the next logical step - breaking down this time to understand if it could be faster - has historically been a much more arduous task that required a lot of intuition and legwork, for novice and expert users alike. We wanted to change that.| Blog
Hendrik Makait, 2023-05-16| Blog
Miles Granger| Blog
At Coiled we develop Dask and automatically deploy it to large clusters of cloud workers (sometimes 1000+ EC2 instances at once!). In order to avoid surprises when we publish a new release, Dask needs to be covered by a comprehensive battery of tests, both for functionality and performance.| Blog
Hendrik Makait| Blog
It’s finally done! 🎉 I am so excited to announce “Communicating Chorrectly with a Choreography”, the first zine from my research group. You can read it online, or print your own free copies to read offline!| decomposition ∘ al
The | vereis.com
Usually, in an event-driven architecture, events are emitted by one service and listened to by many (1:n). But what if it's the other way around? If one service needs to listen to events from many other services?| www.reactivesystems.eu
Cloudflare consistently generates the highest quality public incident writeups of any tech company. Their latest is no exception: Cloudflare incident on November 14, 2024, resulting in lost logs. I…| Surfing Complexity
Recently due to various events (namely a lot of people getting off of| dustycloud.org
Incremental computation represents a transformative (!) approach to data processing. Instead of recomputing everything when your input chang...| muratbuffalo.blogspot.com
November has sucked so far. One upside of the terrible nonsense is that more people are fleeing X. Many are choosing Bluesky. I’ve seen a bunch of takes about this recently, but I keep seeing things I disagree with. I figure that’s a good enough excuse to write more about this weird-assed social network.| anderegg.ca
If you were to design an open social networking protocol, what would that look like? Which metaphors and comparisons would you use to get a general idea of how the network functions? And what would you answer if people ask if your network is decentralised and federated?| fediversereport.com
We can categorize sync platforms across nine dimensions: data size, data update rate, the structure of the data, input latency, offline support, numbe...| stack.convex.dev
Author: Igor Konnov| Protocols Made Fun
A recurring line in discussion of federated, decentralized social media is that no one cares about it. They just want their Twitter without the Nazis. Which is okay. But how it looks on the backend matters. When the illusion of a unified user experience breaks, how accessible the escape pod| Kye Fox
Given that events play such a central role in event-driven architecture, there’s an astonishing lack of agreement on what should be contained in an event. This may be rooted in the fact that, depending on your perspective, events fulfill different purposes.| www.reactivesystems.eu
Sync platforms like Convex simplify distributed state management, ensuring that developers can focus on building their applications rather than managi...| stack.convex.dev
An idiot (Sean Tilley) talks about his feelings regarding the new Social Web Foundation. He's probably wrong, and you should flame him.| deadsuperhero
Note: I originally wrote this on an internal Amazon blog in 2006. This is the original version with a few edits. A newer paper covering the same content can be found on the AWS Builder's Library, Avoiding fallback in distributed systems. More complicated software is more buggy, so programmers try to apply Occam's Razor and code as simply as possible (but no simpler). But how does one define "complexity"?| a-nickels-worth.dev
Simplifying systems to deliver stability by avoiding scaling during times of stress.| Amazon Web Services, Inc.
Abstract| superdurszlak - Distributed Systems by Szymon Durak
Debunking the myth of "exactly-once delivery." Learn the real differences between messaging system guarantees and what they mean for your architecture.| Sequin blog
An opportunity for everyone to make a little self-test. Do you believe any of these five statements? If so, don't worry, you're not the only one, I've come across them many times. I'm very convinced they're untrue, though. This is my little attempt to better a shared understanding of some properties of event-driven architecture.| www.reactivesystems.eu
There’s much uncertainty and doubt (and maybe even fear?) around event-driven architecture. One example is the belief that it’s irrelevant for REST APIs, as using HTTP verbs is quite clearly not event-driven. But behold - you don’t always have to go all-in to win.| www.reactivesystems.eu
Yesterday I read an article describing the GCRA rate limiting| dotat.at
There are significant changes happening in distributed systems.| Colin Breck
In distributed systems, for instance when scaling out some workload to multiple compute nodes, it is a common requirement to select a leader for performing a given task: only one of the nodes should process the records from a Kafka topic partition, write to a file system, call a remote API, etc. Otherwise, multiple workers may end up doing the same task twice, overwriting each other’s data, and worse.| www.morling.dev
Sometimes, a seemingly simple and obvious solution can lead to a series of problems later on. This is especially true when adding retries.| Medium
TIL: Mermaid Gantt diagrams are great for displaying distributed traces in Markdown| brycemecum.com
How Notion built and grew our data lake to keep up with rapid growth| Notion