A data engineering project for beginners, using AWS Redshift, Apache Spark on AWS EMR, and Postgres, orchestrated by Apache Airflow. | www.startdataengineering.com
Worried about setting up end-to-end tests for your data pipelines? Wondering if they are worth the effort? Then this post is for you. We go over some techniques to set up end-to-end tests and see which components to prioritize while testing. | www.startdataengineering.com
Ensure your data meets basic and business-specific data quality constraints. In this post, we go over a data quality testing framework called Great Expectations, which provides functionality to cover the most common test cases, along with the ability to group tests together and run them. | www.startdataengineering.com
dbt (data build tool) tutorial. Build a project that simulates a real-life ELT project using dbt. | www.startdataengineering.com
Using dbt, you can test the output of your SQL transformations. If you have wondered how to "unit test" your SQL transformations in dbt, then this post is for you. We go over how to write unit tests for your SQL transformations with mock inputs/outputs and run them locally. This keeps the development cycle short and lets you follow a TDD approach for your SQL-based data pipelines. | www.startdataengineering.com