The standard use of Raft is for implementing a fault-tolerant replicated state machine by means of a replicated log, maintained at each server within a replication group. Depending on the nature of the state we want to replicate, we can employ a simpler variant of Raft that achieves the same essential correctness properties. We can call this logless Raft, and it can be useful when we are only replicating a single, small piece of state (e.g. configuration or metadata) between servers.| William Schultz
Welcome back to the second part of the series on Druid ingestion internals. In my previous post, Demystifying Druid - Streaming ingestion internals - Supervisors, we went through the internals of a Supervisor. As discussed previously, the supervisor is just an orchestrator that manages the ingestion tasks that eventually read from the streams. In this one, I want to dive deeper into what these tasks look like and what makes them ingest row after row of data.| Uddeshya’s Musings
The difference between shipped and operated software is the difference between something you can run and forget, and something that demands ongoing, hands-on care. Choosing the former protects your team’s focus and sanity.| Pierre Zemb's Blog
Two podcast episodes—one from Oxide and one from Antithesis—on debugging at the limits and building correctness into systems from day one.| Pierre Zemb's Blog
Long time no see :) I believe this will be the first post in almost 3 months.| Uddeshya’s Musings
Other posts from this series: * An introduction to state-based CRDTs * Optimizing state-based CRDTs (part 1) [https://www.bartoszsypytkowski.com/optimizing-state-based-crdts-1/] * Optimizing state-based CRDTs (part 2) [https://www.bartoszsypytkowski.com/optimizing-state-based-crdts-part-2/] * State-based CRDTs: BoundedCounter [https://www.bartoszsypytkowski.com/state-based-crdts-bounded-counter/] * State-based CRDTs: Map [https://www.bartoszsypytkowski.com/crdt-map/] * Operation-| Bartosz Sypytkowski
Originally, I didn't want to make a separate blog post about the design behind bounded counters, but since, apart from the original paper [https://pages.lip6.fr/syncfree/attachments/article/59/boundedCounter-white-paper.pdf] and very few implementations living in the wild, this CRDT is largely unknown, I've decided to give it a| Bartosz Sypytkowski
In this blog post we'll continue on the topic of operation-based CRDTs and focus on the optimizations and approach known as pure operation-based CRDTs [https://hal.inria.fr/hal-01287738/document]: how can we use them to address some of the problems related to partially ordered event logs and optimize size of| Bartosz Sypytkowski
Last time [https://www.bartoszsypytkowski.com/operation-based-crdts-protocol/] we started our operation-based CRDTs sub-series, as we moved away from state-based CRDTs. We talked mostly about the core requirements and a sample implementation of the RCB (Reliable Causal Broadcast) protocol, which was necessary to provide the guarantees required by Commutative Replicated Data Types. Today we'll continue| Bartosz Sypytkowski
Today we're going to cover how to build a complex, JSON-like document CRDT. In the past, we focused on homogeneous data types like registers, sets, arrays or maps. This time we're going to combine them all and tackle some of the challenges that this approach presents. Other blog posts from| Bartosz Sypytkowski
In this blog post, we'll cover the idea of CRDT maps, and how we could create them and utilize them in common scenarios. A prerequisite for this talk is some general knowledge of CRDTs, especially observed-remove sets [https://www.bartoszsypytkowski.com/optimizing-state-based-crdts-part-2/#deltaawareaddwinsobservedremoveset] , which were already covered on this blog| Bartosz Sypytkowski
This time we're going to cover a new implementation of a persistent key-value store, using Conflict-free Replicated Data Types (CRDTs) to enable multi-process writes. Moreover, this approach enables a shift of the replication protocol from custom-made gossip servers to passive replicated storage such as iCloud or Google Drive. Why?| Bartosz Sypytkowski
Today we're going to jump into LSeq - one of the famous text-editing conflict-free replicated data types we briefly introduced in past blog posts. This time we're aiming to fix one of the popular issues that this algorithm struggles with. We're going step| Bartosz Sypytkowski
Today we'll explain how modern databases allow us to perform backups without blocking - thus enabling users to operate on them while the backup is being made continuously in the background. We'll also show how this approach allows us to restore a database to any point| Bartosz Sypytkowski
In this blog post we're going to cover the concepts and implementation behind collaborative 2-dimensional tables, with a set of operations that could make them useful as spreadsheets - popular in products like MS Excel or Google Sheets. There are many open source and commercial alternatives| Bartosz Sypytkowski
This blog post is a short summary of the ecosystem and capabilities of Yrs (read: wires): a fully-compatible Rust port of the Yjs library used to build collaborative peer-to-peer applications thanks to the power of Conflict-free Replicated Data Types. They are being used and adopted in many different products such as Jupyter| Bartosz Sypytkowski
Today we're going to continue our exploration of the Conflict-free Replicated Data Types domain. This time we'll start designing protocols that treat security aspects as first-class citizens. Managing security and permissions in peer-to-peer systems may be quite cumbersome as often there's no single| Bartosz Sypytkowski
Some time ago, we covered the idea behind HyParView, a cluster membership protocol that allowed for very fast and scalable cluster construction. It did so by using the concept of a partial view: while our cluster could be built out of thousands of nodes, each one of them would only be| Bartosz Sypytkowski
In this blog post we'll define the basics for a move operation used by Conflict-free Replicated Data Types (CRDT), which allows us to reorder the items in a collection. We'll define the general approach and go through an example implementation built on top of YATA sequences. What| Bartosz Sypytkowski
In the past we already discussed how to build a JSON-like Conflict-free Replicated Data Type. We used an operation-based approach, with wide support for many operations and the characteristics of dynamically typed recursive documents. However, with that power came complexity. Today we'll discuss a much simpler approach, which some may find| Bartosz Sypytkowski
A technical deep-dive into FoundationDB Record Layer continuations, explaining how they enable long-running operations by segmenting work across multiple FDB transactions, effectively bypassing the inherent 5-second and 10MB limits.| Pierre Zemb's Blog
At Datadog, we regularly hold hackathons, a dedicated time when we can explore new ideas and tinker with new technologies. During one of these hackathons, I found myself working side by side with a colleague who holds a Data Mining & Algorithms PhD. Driven by the desire to do something both cool and complex, we decided on building an online anomaly detection method for streaming logs. We both work in the Cloud SIEM team, a team that provides a security tool to analyse logs in a stateful manne...| Adri’s Blog
Introduction I was reading this small summary of the 3FS architecture in this blog - 3FS Performance Journal-1. In my opinion, it’s a pretty neat piece of work, and it mentioned that the “Management Server” component kept track of all node addresses.| Uddeshya’s Musings
Other posts from this series: * An introduction to state-based CRDTs [https://www.bartoszsypytkowski.com/the-state-of-a-state-based-crdts/] * Optimizing state-based CRDTs (part 1) * Optimizing state-based CRDTs (part 2) [https://www.bartoszsypytkowski.com/optimizing-state-based-crdts-part-2/] * State-based CRDTs: BoundedCounter [https://www.bartoszsypytkowski.com/state-based-crdts-bounded-counter/] * State-based CRDTs: Map [https://www.bartoszsypytkowski.com/crdt-map/] * Operatio| Bartosz Sypytkowski
Last time [https://www.bartoszsypytkowski.com/operation-based-crdts-arrays-1/] we were discussing how to build Commutative Replicated Data Types operating as indexed sequences - preserving the order of inserted elements - using two different data structures: Linear Sequences (LSeq) and Replicated Growable Arrays (RGA). In this blog post we'll continue the topic| Bartosz Sypytkowski
In this post, we'll continue on the topic of Commutative Replicated Data Types. We already mentioned [https://www.bartoszsypytkowski.com/operation-based-crdts-registers-and-sets/#sets] how to prepare the first, most basic type of collection: sets. This time we'll take a look at indexed sequences with add/remove operations. Other blog posts from| Bartosz Sypytkowski
Today we'll continue a series about CRDTs; this time, however, we'll stray from the path of state-based CRDTs and start talking about their operation-based relatives. The major difference that we need to cover is the center of gravity of this approach: the replication protocol. Other blog posts from this series:| Bartosz Sypytkowski
Introduction| Uddeshya’s Musings
This post is a migrated version of the one already published on Medium.| Uddeshya’s Musings
My attempt to decipher the Raft whitepaper and how the KRaft implementation adheres to the Raft philosophy and techniques.| Uddeshya’s Musings
My personal website.| William Schultz
If we want to formally prove that a system satisfies some safety property (i.e. an invariant), we can do this by finding an inductive invariant. An inductive invariant is a particular type of invariant that is at least as strong as the target invariant to be proven, and is also inductive, meaning that it is closed under all transitions of the system.| William Schultz
Below is a high-level overview of the Raft reconfiguration bug cases laid out in Diego Ongaro’s group post, which described the problematic scenarios in Raft’s single server reconfiguration (i.e. membership change) algorithm. Configurations are annotated with their terms, i.e., a config \(X\) in term \(t\) is shown as \(X^t\). Cases: one add, one remove; two adds; two removes. I view all of these bug cases as instances of a common problem related to config management when logs diverge (i.e., when th...| William Schultz
Learn how different distributed databases handle secondary indexes, and the benefits and drawbacks of each approach.| DeBrie Advisory Blog
In a message-based system, we might feel a lack of control, especially when in need of compensating changes spread across the system. Fear not! Real life dea...| Mauro Servienti - Milestone
There are scenarios when a chatty services relationship seems the only option, with the results of coupling quickly becoming our best friend. Not all hope is...| Mauro Servienti - Milestone
__RETRY = "::RETRY::"| notes.abhinavsarkar.net
CI pipeline of abhinavsarkar.net| notes.abhinavsarkar.net
After a series of 11 blog posts about Conflict-free Replicated Data Types, it's time to wrap up. This time let's discuss various optimizations that could be applied to CRDTs working at higher scale. Other blog posts from this series: * An introduction to state-based CRDTs [https://www.bartoszsypytkowski.com/the-state-of-a-state-based-crdts/] * Optimizing state-based| Bartosz Sypytkowski
In this blog post we're coming back to indexed sequence CRDTs - we already discussed some operation-based approaches in the past. This time we'll cover YATA (Yet Another Transformation Approach): a delta-state based variant, introduced [https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=| Bartosz Sypytkowski