This is a short follow-up to Murat’s PigPaxos post. I strongly recommend reading it first as it provides full context for what is to follow. And yes, it also includes the explanation of what pigs have to do with Paxos.| Aleksey Charapko
Metastable failures in distributed systems are failures that “feed” and strengthen their own “failed” condition. The main characteristic of a metastable failure is a positive feedback loop that keeps the system in a degraded/failed state. These failures are hard to spot, as they always start with some other distraction — some trigger event that nudges the system over its operating limit or capacity. However, fixing the trigger is not enough, and engineers realize this too late — s...| Aleksey Charapko
I will be presenting our new paper, “HoliPaxos: Towards More Predictable Performance in State Machine Replication,” at VLDB’25. Feel free to ping me if you are there and want to chat!| Aleksey Charapko
Many, if not all, practical distributed systems rely on partial synchrony in one way or another, be it failure detection, a lease mechanism, or some optimization that takes advantage of synchrony to avoid doing a bunch of extra work. These partial synchrony approaches need to know some crucial parameters about their world to estimate how much synchrony to expect. Misjudging such synchrony guarantees may have significant negative consequences. While most practical systems plan for peri...| Aleksey Charapko
Fall 2025 Reading List (##201-210)| charap.co
Here is a list for the Fall 2025 semester. Please join the reading group here: https://discord.gg/VS7J4PAU58. We meet on Thursdays. The schedule is also available on our calendar. | Aleksey Charapko
The last paper we covered in the Distributed Systems Reading group discussed CPUs, data centers, scheduling, and carbon emissions—we read “The Sunk Carbon Fallacy: Rethinking Carbon Footprint Metrics for Effective Carbon-Aware Scheduling.” Below is my improvised presentation of this paper for the reading group. | Aleksey Charapko
Last week, we read the “Databases in the Era of Memory-Centric Computing” CIDR’25 paper in our reading group. This paper argues that the rising cost of main memory and lagging improvement in memory bandwidth do not bode well for traditional computing architectures centered around computing devices (i.e., CPUs). As CPUs get more cores, the memory bandwidth available to each core remains flat or even decreases, as shown in the figure below. Similarly, the cost of memory does not decrease as f...| Aleksey Charapko
Last week, we read the “OLTP Through the Looking Glass 16 Years Later: Communication is the New Bottleneck” CIDR’25 paper by Xinjing Zhou, Viktor Leis, Xiangyao Yu, and Michael Stonebraker. This paper revisits the original “OLTP Through the Looking Glass, and What We Found There” paper and examines the bottlenecks in modern OLTP databases.| Aleksey Charapko
We have been doing a Zoom distributed systems paper reading group for 5 years and have covered around 190 papers. This semester, we should reach the milestone of 200 papers. Over the years, my commitment to the group has varied — at some point, I was writing paper reviews, and more recently, I’ve had less time to do that. However, I will try to recommit to these reviews in some short form… And so, our 191st paper was “Occam’s Razor for Distributed Protocols” from SoCC’24.| Aleksey Charapko
Occam’s Razor for Distributed Protocols [SoCC’24]| Aleksey Charapko
Starburst: A Cost-aware Scheduler for Hybrid Cloud [ATC’24] | Aleksey Charapko
Last time, I briefly talked about my pile of eternal rejections. Today, I will describe another paper that has been sitting in that pile for quite some time. It seems like this particular work, done by my student Bocheng Cui, has found its home, though, and by the skin of its teeth, it will appear in SRDS’24.| Aleksey Charapko
This semester, I taught a class that I am extremely unqualified to teach. It was super fun!| Aleksey Charapko
This is a list of papers for the DistSys reading group summer term. The schedule is also available on our Google Calendar.| Aleksey Charapko
I have a “pile” of papers that keep getting rejected from every conference. All these papers, according to the reviews, “lack novelty,” and therefore are deemed “not interesting” by the reviewing experts. These papers have some things in common: they are either observational or rely on old and proven techniques to solve a problem or improve a system/algorithm. Jokingly, I call this set of papers the “pile of eternal rejections.” Recently, the pile...| Aleksey Charapko
A Cloud-Scale Characterization of Remote Procedure Calls [SOSP’23]| Aleksey Charapko
For the 153rd reading group paper, we read something very different this time: “Deep Note: Can Acoustic Interference Damage the Availability of Hard Disk Storage in Underwater Data Centers?” This HotStorage short paper explores the possibility of using soundwaves to attack underwater data centers. In particular, the paper shows how magnetic disks can be disrupted by sound, causing them to degrade or even outright stop working, leading to failures of whatever systems were relying on these ...| Aleksey Charapko
The 152nd reading group meeting continued the microservices discussion started in the 151st paper. We read the “Blueprint: A Toolchain for Highly-Reconfigurable Microservice Applications” SOSP’23 paper by Vaastav Anand, Deepak Garg, Antoine Kaufmann, and Jonathan Mace. The premise of Blueprint is the separation of the app’s logic from most of the infrastructure/ops concerns. | Aleksey Charapko
We kicked off the winter term set of papers in the reading group with the “Towards Modern Development of Cloud Applications” HotOS’23 paper. The paper proposes a different approach to designing distributed applications by replacing the microservice architecture style with something more fluid.| Aleksey Charapko