Efficient Data Processing in SQL: A guide to understanding the core concepts of distributed data storage & processing, analytical functions, and query optimizations in your data warehouse. You want to be able to write efficient data processing pipelines in SQL, but you don't know where to start! There are too many topics to learn to get proficient at efficient data processing in SQL, like optimizing queries, partitioning, parallelism, data modeling, best practices, etc. It is overwhelming to have...| Gumroad
Learn how to install Docker Compose. Compose is available natively on Docker Desktop, as a Docker Engine plugin, and as a standalone tool.| Docker Documentation
Learn how to choose the best method for you to install Docker Engine. This client-server application is available on Linux, Mac, Windows, and as a static binary.| Docker Documentation
Are you disappointed with online SQL tutorials that aren't deep enough? Are you frustrated knowing that you are missing SQL skills, but can't quite put your finger on it? This post is for you. In this post, we go over a few topics that can take your SQL skills to the next level and help you be a better data engineer.| www.startdataengineering.com
In this post, we go over 6 key concepts to help you master window functions. Window functions are one of the most powerful features of SQL; they are very useful in analytics and for performing operations that cannot be done easily with the standard group by, subquery, and filters. Despite this, window functions are not used frequently. If you have ever thought 'window functions are confusing', then this post is for you.| www.startdataengineering.com
Change data capture is a popular technique to copy data from DBs into warehouses. However, it can be tricky to understand at first. Without working with a CDC system, knowing what it does, why it's needed, or how it works can be challenging. However, understanding the what, why, and how of CDC can help you set up pipelines that are resilient and reliable. If you have wondered what CDC does, why it's needed, and how it works, this post is for you. By the end of this post, you will have a good ...| www.startdataengineering.com
Data pipelines built (and added on to) without a solid foundation will suffer from poor efficiency, slow development speed, long times to triage production issues, and poor testability. What if your data pipelines are elegant and enable you to deliver features quickly? An easy-to-maintain and extendable data pipeline significantly increases developer morale, stakeholder trust, and the business bottom line! Using the correct design pattern will increase feature delivery speed and developer valu...| www.startdataengineering.com
Confused by all the tools and frameworks available to scale your data pipeline? Then this post is for you. In this post, we go over what scaling is, the different types of scaling, and how to choose scaling strategies for your data pipelines. By the end of this post, you will be able to come up with the correct scaling strategy for any data pipeline.| www.startdataengineering.com
Installing from Source| git-scm.com