Are you disappointed with online SQL tutorials that aren't deep enough? Are you frustrated knowing that you are missing SQL skills, but can't quite put your finger on it? This post is for you. In this post, we go over a few topics that can take your SQL skills to the next level and help you be a better data engineer.| www.startdataengineering.com
Working with a dataset that is too large to fit in memory? Then this post is for you. In this post, we will write memory-efficient data pipelines using Python generators. We also cover the common generator patterns you will need for your data pipelines.| www.startdataengineering.com
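As a rough illustration of the generator idea in the summary above, here is a minimal sketch of a pipeline that streams rows one at a time rather than loading the whole dataset. The function names (`read_rows`, `keep_large_orders`, `batch`), the column names, and the sample data are assumptions made for this example, not code from the post.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, str]


def read_rows(lines: Iterable[str]) -> Iterator[Row]:
    """Yield one parsed CSV row at a time instead of reading everything up front."""
    yield from csv.DictReader(lines)


def keep_large_orders(rows: Iterable[Row], min_amount: float) -> Iterator[Row]:
    """Lazily filter rows; only the current row is held in memory."""
    for row in rows:
        if float(row["amount"]) >= min_amount:
            yield row


def batch(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Group rows into fixed-size chunks, e.g. for bulk inserts."""
    chunk: List[Row] = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:  # emit the final, possibly smaller, chunk
        yield chunk


# Hypothetical in-memory sample; a real pipeline would pass an open file handle.
sample = io.StringIO("order_id,amount\n1,5.0\n2,50.0\n3,75.5\n4,20.0\n5,99.9\n")
batches = list(batch(keep_large_orders(read_rows(sample), 20.0), 2))
```

Because each stage pulls from the previous one on demand, peak memory is bounded by the batch size rather than by the size of the input.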
Wondering how to execute a Spark job on an AWS EMR cluster, based on a file upload event on S3? Then this post is for you. In this post, we go over how to trigger Spark jobs on an AWS EMR cluster using AWS Lambda. The Lambda function will execute in response to an S3 upload event. We will go over this event-driven pattern with code snippets and set up a fully functioning pipeline.| www.startdataengineering.com
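The event-driven pattern in the summary above could look roughly like the following sketch, which is not the post's own code. It assumes an already running EMR cluster, the standard S3 event notification layout, and two hypothetical environment variables, `EMR_CLUSTER_ID` and `SPARK_SCRIPT_URI`; the helper name `build_spark_step` is also made up for illustration.

```python
import os
from urllib.parse import unquote_plus


def build_spark_step(event: dict, script_uri: str) -> dict:
    """Turn an S3 upload event into an EMR step that runs spark-submit on the new file."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = unquote_plus(record["object"]["key"])
    input_uri = f"s3://{bucket}/{key}"
    return {
        "Name": f"process {key}",
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "--deploy-mode", "cluster", script_uri, input_uri],
        },
    }


def lambda_handler(event, context):
    """Lambda entry point: submit one Spark step per upload event."""
    import boto3  # imported here so the helper above can be exercised without AWS access

    emr = boto3.client("emr")
    step = build_spark_step(event, os.environ["SPARK_SCRIPT_URI"])  # assumed env var
    response = emr.add_job_flow_steps(
        JobFlowId=os.environ["EMR_CLUSTER_ID"],  # assumed env var
        Steps=[step],
    )
    return {"step_ids": response["StepIds"]}
```

Keeping the step construction in a plain function separates the event parsing from the AWS call, so the parsing can be checked locally with a sample event.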
Confused by all the tools and frameworks available to scale your data pipeline? Then this post is for you. In this post, we go over what scaling is, the different types of scaling, and how to choose scaling strategies for your data pipelines. By the end of this post, you will be able to come up with the correct scaling strategy for any data pipeline.| www.startdataengineering.com