Replication slots in Postgres keep track of how far consumers have read a replication stream. After a restart, consumers—either Postgres read replicas or external tools for change data capture (CDC), like Debezium—resume reading from the last confirmed log sequence number (LSN) of their replication slot. The slot prevents the database from disposing of required log segments, allowing safe resumption after downtime. In this post, we are going to take a look at why Postgres replication s...| www.morling.dev
Over the last couple of years, I’ve helped dozens of users and organizations build Change Data Capture (CDC) pipelines for their Postgres databases. A key concern in that process is setting up and managing replication slots, which are Postgres' mechanism for making sure that segments of the write-ahead log (WAL) of the database are kept around until they have been processed by registered replication consumers. If you are not careful, a replication slot may cause unduly large amounts ...| www.morling.dev
Update March 27: This post is being discussed on Hacker News. For building a system of distributed services, one concept I think is very valuable to keep in mind is what I call the synchrony budget: as much as possible, a service should minimize the number of synchronous requests which it makes to other services.| www.morling.dev
In the "Let’s Take a Look at…!" blog series I am going to explore interesting projects, developments and technologies in the data and streaming space. This can be KIPs and FLIPs, open-source projects, services, and more. The idea is to get some hands-on experience, learn about potential use cases and applications, and understand the trade-offs involved. If you think there’s a specific subject I should take a look at, let me know in the comments below! That guy above? Yep, that’s me, w...| www.morling.dev
In many applications, it’s a requirement to keep track of when a record was created and when it was last updated. Often, this is implemented by having columns such as created_at and updated_at within each table. To make things as simple as possible for application developers, the database itself should take care of maintaining these values automatically when a record gets inserted or updated.| morling.dev -- Blog
While working on a demo for processing change events from Postgres with Apache Flink, I noticed an interesting phenomenon: A Postgres database which I had set up for that demo on Amazon RDS ran out of disk space. The machine had a disk size of 200 GiB, which was fully used up in less than two weeks. A common cause of this kind of issue is replication slots which are not advanced: in that case, Postgres will hold on to all WAL segments after the latest log sequence number (...| www.morling.dev
Update Feb 5: This post is discussed on Hacker News. Reading a blog post about what’s coming up in JDK 16 recently, I learned that one of the new features is support for Unix domain sockets (JEP 380). Before Java 16, you’d have to resort to 3rd party libraries like jnr-unixsocket in order to use them. If you haven’t heard about Unix domain sockets before, they are "data communications [endpoints] for exchanging data between processes executing on the same host operating system". Don’t ...| www.morling.dev
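
To make the Unix domain socket support mentioned in the last entry more concrete, here is a minimal sketch of what the JDK 16 API (JEP 380) looks like in use. It is not taken from the linked post: the class name `UnixSocketEcho`, the method `roundTrip`, and the socket file name `demo.sock` are made up for illustration, and the example assumes Java 16 or later on an operating system that supports Unix domain sockets. Server and client run in the same thread, which only works because the message is small enough to fit into the socket buffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.StandardProtocolFamily;
import java.net.UnixDomainSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; the names are illustrative and not from the original post.
public class UnixSocketEcho {

    // Sends a message through a Unix domain socket and returns what the server side received.
    public static String roundTrip(String message) {
        try {
            // The socket is identified by a file system path rather than a host and port.
            Path dir = Files.createTempDirectory("uds");
            Path socketFile = dir.resolve("demo.sock");
            UnixDomainSocketAddress address = UnixDomainSocketAddress.of(socketFile);

            try (ServerSocketChannel server = ServerSocketChannel.open(StandardProtocolFamily.UNIX)) {
                server.bind(address); // creates the socket file

                try (SocketChannel client = SocketChannel.open(StandardProtocolFamily.UNIX)) {
                    client.connect(address);
                    ByteBuffer out = ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8));
                    while (out.hasRemaining()) {
                        client.write(out);
                    }
                    client.shutdownOutput(); // signals end-of-stream to the reader

                    try (SocketChannel accepted = server.accept()) {
                        ByteArrayOutputStream received = new ByteArrayOutputStream();
                        ByteBuffer in = ByteBuffer.allocate(1024);
                        // Read until the client's end-of-stream marker arrives.
                        while (accepted.read(in) != -1) {
                            in.flip();
                            received.write(in.array(), 0, in.limit());
                            in.clear();
                        }
                        return received.toString(StandardCharsets.UTF_8);
                    }
                }
            } finally {
                // The socket file is not removed automatically when the channel closes.
                Files.deleteIfExists(socketFile);
                Files.deleteIfExists(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("ping"));
    }
}
```

The channel classes are the same `SocketChannel` and `ServerSocketChannel` used for TCP; only the protocol family and the address type differ, which is why no third-party library is needed from Java 16 onwards.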