Are you a data engineer(or new to data space) wondering why one may need to use Apache Airflow vs. just using cron? Does Apache Airflow feel like an over-optimized solution for a simple problem? Then this post is for you. Understanding the critical features necessary for a data pipelining system will ensure that your output is high quality! Imagine knowing exactly what a complex orchestration system brings to the table; you can make the right tradeoffs for your data architecture. This post wi...| www.startdataengineering.com
You know Python is essential for a data engineer. Does anyone know how much one should learn to become a data engineer? When you're in an interview with a hiring manager, how can you effectively demonstrate your Python proficiency? Imagine knowing exactly how to build resilient and stable data pipelines (using any language). Knowing the foundational ideas for data processing will ensure you can quickly adapt to the ever-changing tools landscape. In this post, we will review the concepts you n...| www.startdataengineering.com
Data engineering project for beginners, using AWS Redshift, Apache Spark in AWS EMR, Postgres and orchestrated by Apache Airflow.| www.startdataengineering.com
If you are trying to improve your data engineering skills or are the sole data person in your company, it can be hard to know how well your technical skills are developing. Questions like Am I building pipelines the right way? How do I measure up to DEs at bigger tech companies? How do I get feedback on my pipeline design? It can cause a lot of uncertainty in career development! Imagine if you know that your code is on par (or even better than) with pipelines at tech-forward companies and tha...| www.startdataengineering.com