This paper (to appear in VLDB'25) proposes a consensus algorithm called "Cabinet", which dynamically adjusts node weights based on responsi...| muratbuffalo.blogspot.com
This paper from SIGMOD 2016 proposes a transaction healing approach to improve the scalability of Optimistic Concurrency Control (OCC) in m...| muratbuffalo.blogspot.com
This Google paper (to appear in VLDB'25) is about not blowing up your production system. That is harder than it sounds, especially with stateful applications with memories. When rolling out new versions of stateful applications, the "shared, persistent, mutable" data means bugs can easily propagate across versions. Modern rollout tricks (canaries, blue/green deployments) don't save you from this. Subtle cross-version issues often slip through pre-production testing and surface in production, ...| Metadata
This paper (VLDB'2024) looks at boosting transaction throughput through better scheduling. The idea is to explore the schedule-space more systematically and pick execution orders that reduce conflicts.| Metadata
The paper (arXiv 2020, also AI review 2023) opens by discussing recent high-profile AI debates: the Montréal AI Debate and the AAAI 2020 fireside chat with Kahneman, Hinton, LeCun, and Bengio. A consensus seems to be emerging: for AI to be robust and trustworthy, it must combine learning with reasoning. Kahneman's "System 1 vs. System 2" dual framing of cognition maps well to deep learning and symbolic reasoning. And AI needs both.| Metadata
The paper (2023) argues for integrating two historically divergent traditions in artificial intelligence (neural networks and symbolic reasoning) into a unified paradigm called Neurosymbolic AI. It argues that the path to capable, explainable, and trustworthy artificial intelligence lies in marrying perception-driven neural systems with structure-aware symbolic models. | Metadata
This paper from HotStorage'25 presents OrcaCache, a design proposal for a coordinated caching framework tailored to disaggregated storage systems. In a disaggregated architecture, compute and storage resources are physically separated and connected via high-speed networks. Such architectures have become increasingly common in modern data centers as they enable flexible resource scaling and improved fault isolation. (Follow the money as they say!) But accessing remote storage introduces serious latency and effi...| Metadata
Aleksey and I sat down to read this paper on Monday night. This was an experiment which aimed to share how experts read papers in real tim...| muratbuffalo.blogspot.com
This paper (PODC'2016) presents a clean and declarative treatment of Snapshot Isolation (SI) using dependency graphs. It builds on the foundation laid by prior work, including the SSI paper we reviewed recently, which had already identified that SI permits cycles with two adjacent anti-dependency (RW) edges, the so-called inConflict and outConflict edges. While the SSI work focused on algorithmic results and implementation, this paper focuses more on the theory (this is PODC after all) of def...| Metadata
I know I should call this recent listens, but I am stuck with the series name. So here it goes. These are some recent "reads" this month.| Metadata
This EuroSys '23 paper reads like an SOSP best paper. Maybe it helped that EuroSys 2023 was in Rome. Academic conferences are more enjoyabl...| muratbuffalo.blogspot.com
This paper (SIGMOD '08) proposes a lightweight runtime technique to make Snapshot Isolation (SI) serializable without falling back to locki...| muratbuffalo.blogspot.com
ATC and OSDI ran in parallel. As is tradition, OSDI was single-track; ATC had two parallel tracks. The schedules and papers are online as linked above.| Metadata
This week I was in Boston for ATC/OSDI’25. Downtown Boston is a unique place where two/three-hundred-year-old homes and cobblestone streets are mixed with sleek buildings and biotech towers. The people here look wicked smart and ambitious (although lacking the optimism/cheer of Bay Area people). It’s a sharp contrast from Buffalo, where the ambition is more about not standing out.| Metadata
Chapter 7 of the Concurrency Control and Recovery in Database Systems book by Bernstein, Hadzilacos, and Goodman (1987) tackles the distributed commit problem: ensuring atomic commit across a set of distributed sites that may fail independently.| Metadata
On distributed systems broadly defined and other curiosities. The opinions on this site are my own.| muratbuffalo.blogspot.com
With Chapter 6, the Concurrency Control and Recovery in Database Systems book shifts focus from concurrency control to recovery! This c...| muratbuffalo.blogspot.com
So it goes: your system is purring like a tiger, devouring requests, until, without warning, it slumps into existential dread. Not a crash. Not a bang. A quiet, self-sustaining collapse. The system doesn’t stop. It just refuses to get better. Metastable failure is what happens when the feedback loops in the system go feral. Retries pile up, queues overflow, recovery stalls. Everything runs but nothing improves. The system is busy and useless.| Metadata
Chapter 2 of Concurrency Control and Recovery in Database Systems (1987) by Bernstein, Hadzilacos, and Goodman is a foundational treatment ...| muratbuffalo.blogspot.com
Chapter 5 of Concurrency Control and Recovery in Database Systems (1987) introduces multiversion concurrency control (MVCC), a fundamental advance over single-version techniques. Instead of overwriting data, each write operation creates a new version of the data item. Readers can access older committed versions without blocking concurrent writes or being blocked by concurrent writes.| Metadata
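The multiversion idea in the excerpt above can be illustrated with a minimal sketch. This is a toy in-memory store, not the book's algorithm: the class name `MultiversionStore`, the logical commit counter `clock`, and the `snapshot`/`read` helpers are all hypothetical names chosen for illustration, and every write is assumed to commit immediately.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class MultiversionStore:
    """Toy multiversion store (hypothetical): writes append versions, never overwrite."""

    # key -> list of (commit_ts, value), kept in increasing commit_ts order
    versions: Dict[str, List[Tuple[int, Any]]] = field(default_factory=dict)
    clock: int = 0  # assumed logical commit counter, not a real timestamp source

    def write(self, key: str, value: Any) -> int:
        """Append a new committed version and return its commit timestamp."""
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))
        return self.clock

    def snapshot(self) -> int:
        """Return the timestamp that a reader starting now would read as of."""
        return self.clock

    def read(self, key: str, as_of: int) -> Optional[Any]:
        """Return the newest version committed at or before `as_of`, if any."""
        for ts, value in reversed(self.versions.get(key, [])):
            if ts <= as_of:
                return value
        return None


store = MultiversionStore()
store.write("x", 1)
snap = store.snapshot()      # reader takes its snapshot here
store.write("x", 2)          # a later write does not disturb that reader
old_value = store.read("x", snap)
new_value = store.read("x", store.snapshot())
```

In this sketch the reader holding `snap` keeps seeing the older committed version of `x` while the later write proceeds, which is the non-blocking behavior the excerpt describes. Garbage collection of old versions and uncommitted writes are deliberately left out.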
Chapter 4 of the Concurrency Control and Recovery in Database Systems book (1987) opens with a sentence that doesn't quite pass the grammar test: "In this chapter we will examine two scheduling techniques that do not use locks, timestamp ordering (TO) and serialization graph testing (SGT)." That comma is trying to do the job of a colon and failing at it. Precision matters, more so in technical writing.| Metadata
Chapter 3 presents two-phase locking (2PL). Remember I told you in Chapter 2: Serializability Theory that the discussion is very scheduler-centric? Well, this is a deeper dive into the scheduler, using 2PL as the concurrency control mechanism. The chapter examines the design trade-offs in scheduler behavior, proves the correctness of basic 2PL, dissects how deadlocks arise and are handled, and discusses many variations and implementation issues.| Metadata
I attended the TLA+ Community Event at Hamilton, Ontario on Sunday. Several talks pushed the boundaries of formal methods in the real world...| muratbuffalo.blogspot.com
Joint work with Will Schultz.| Metadata
I'm catching up on Phil Eaton's book club and just finished the preface and Chapter 1 of Concurrency Control and Recovery in Database Systems by Bernstein, Hadzilacos, and Goodman.| Metadata
This Sunday, I'll be attending (and speaking at) the TLA+ Community Event, co-located with ETAPS 2025 in Hamilton, Ontario. The setting i...| muratbuffalo.blogspot.com
This EuroSys 2025 paper wrestles with the messy interface between formal specification and implementation reality in distributed systems. T...| muratbuffalo.blogspot.com
Entrepreneurs think and act differently from managers and strategists. This 2008 paper argues that entrepreneurs use effectual reasoning, t...| muratbuffalo.blogspot.com
This paper (2021) dives deeper into the Parallel Raft protocol introduced with PolarFS. PolarFS (VLDB'18) is a distributed file system with ...| muratbuffalo.blogspot.com
This paper (NSDI'25) applies lightweight formal methods (hence the pun "smart casual" in contrast to formal attire) to the Confidential Con...| muratbuffalo.blogspot.com
I have been reviewing papers for USENIX ATC and handling work stuff at MongoDB Research. I cannot blog about either yet. So, instead of a pa...| muratbuffalo.blogspot.com
This paper, presented in the industry track of ICDE 2024, introduces GaussDB-Global (GlobalDB), Huawei's geographically distributed datab...| muratbuffalo.blogspot.com
Distributed systems are characterized by nodes executing concurrently with no shared state and no common clock. Coordination between nodes a...| muratbuffalo.blogspot.com
This master's thesis at Lund University, Sweden explores how CockroachDB's transactional performance can be improved by using tightly synch...| muratbuffalo.blogspot.com
Incremental computation represents a transformative (!) approach to data processing. Instead of recomputing everything when your input chang...| muratbuffalo.blogspot.com
I have been teaching a TLA+ miniseries inside AWS. I just finished the 10th week, with a one-hour seminar each week. I wanted to pen down my...| muratbuffalo.blogspot.com