Every data engineering interview includes a SQL round. If you are worried about job descriptions asking for advanced SQL without being sure what advanced SQL means for an interview, anxious about being unable to get a job, or frustrated with online SQL courses that teach the basic dialects but not a step-by-step approach to problem solving, this post is for you. Imagine being able to dissect any SQL problem and make the interviewer say, "I need this person on my team." That is ...| www.startdataengineering.com
System design interviews are usually vague and depend on you (as the interviewee) to guide the interviewer. If you are thinking: How do I prepare for data engineering system design interviews? I struggle to think of questions you would ask in a system design interview for data engineering; I don't have enough interview experience to know what companies ask. Is data engineering "system design" more than choosing between technologies like Spark and Airflow? This post is for you! Imagine being a...| www.startdataengineering.com
Introduction Setup SQL tips 1. Handy functions for common data processing scenarios 1.1. Need to filter on WINDOW function without CTE/Subquery use QUALIFY 1.2. Need the first/last row in a partition, use DISTINCT ON 1.3. STRUCT data types are sorted based on their keys from left to right 1.4. Get the first/last element with ROW_NUMBER() + QUALIFY 1.5. Check if at least one or all boolean values are true with BOOL_OR & BOOL_AND respectively 1.| Start Data Engineering
Preparing for data engineering interviews can be stressful. There are so many things to learn. In this 'Data Engineering Interview Series', you will learn how to crack each section of the data engineering interview. If you have felt that you need to practice 100s of Leetcode questions to crack the data engineering interview, that you have no idea where/how to start preparing for the data structures and algorithms interview, that you are not good enough to crack the data structures and alg...| www.startdataengineering.com
Do you use SQL or Python for data processing? Every data engineer has a preference. Some swear by Python, pointing out that it's a Turing-complete language. The SQL camp, in turn, points to its performance, ease of understanding, etc. Not using the right tool for the job can lead to hard-to-maintain code and sleepless nights! Using the right tool for the job can help you climb the career ladder, but all the advice online seems to be 'Just use Python' or 'Just use SQL.' U...| www.startdataengineering.com
You know Python is essential for a data engineer. But how much Python should you learn to become one? When you're in an interview with a hiring manager, how can you effectively demonstrate your Python proficiency? Imagine knowing exactly how to build resilient and stable data pipelines (using any language). Knowing the foundational ideas for data processing will ensure you can quickly adapt to the ever-changing tools landscape. In this post, we will review the concepts you n...| www.startdataengineering.com
Are you disappointed with online SQL tutorials that aren't deep enough? Are you frustrated knowing that you are missing SQL skills, but can't quite put your finger on it? This post is for you. In this post, we go over a few topics that can take your SQL skills to the next level and help you be a better data engineer.| www.startdataengineering.com
You have heard of Common Table Expressions (CTEs), but are not sure what they are and when to use them. What if you knew exactly what CTEs were and when to use them? In this post, we go over what CTEs are and compare their performance against subqueries, derived tables, and temp tables to help you decide when to use them.| www.startdataengineering.com
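As a rough illustration of what a CTE looks like next to the equivalent subquery (derived table), here is a minimal sketch using Python's built-in `sqlite3` module. The `orders` table, its columns, and the sample rows are hypothetical and are not taken from the linked post; the sketch shows only that both forms return the same rows and says nothing about their relative performance.

```python
import sqlite3

# In-memory database with a small, made-up orders table (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (customer_id INTEGER, amount REAL);
INSERT INTO orders VALUES (1, 10.0), (1, 30.0), (2, 5.0);
""")

# CTE form: the intermediate result is named up front, then queried.
cte_query = """
WITH customer_totals AS (
    SELECT customer_id, SUM(amount) AS total
    FROM orders
    GROUP BY customer_id
)
SELECT customer_id, total
FROM customer_totals
WHERE total > 20
ORDER BY customer_id
"""

# Subquery (derived table) form: the same intermediate result is defined inline.
subquery_query = """
SELECT customer_id, total
FROM (
    SELECT customer_id, SUM(amount) AS total
    FROM orders
    GROUP BY customer_id
) AS customer_totals
WHERE total > 20
ORDER BY customer_id
"""

cte_rows = conn.execute(cte_query).fetchall()
subquery_rows = conn.execute(subquery_query).fetchall()
print(cte_rows)
print(subquery_rows)
```

Whether the two forms perform differently depends on the database engine and its optimizer, so treat this only as a readability comparison.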
Wondering how to store a dimension table's history over time and how to join these historical dimension tables with fact tables for analytical querying? Then this post is for you. In this post, we will go over a popular dimension modeling technique called SCD2, which preserves historical changes. We will also see how to join a fact table with an SCD2 table to get accurate point-in-time information.| www.startdataengineering.com
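The point-in-time join described above can be sketched as follows, again with Python's built-in `sqlite3` module. The table names (`dim_customer`, `fact_orders`), the `valid_from`/`valid_to` columns, the far-future end date for the current row, and the sample data are all assumptions made for illustration; the linked post may model SCD2 differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical SCD2 dimension: one row per version of a customer,
-- with the validity window stored as ISO date strings.
CREATE TABLE dim_customer (
    customer_id INTEGER,
    city        TEXT,
    valid_from  TEXT,
    valid_to    TEXT
);
-- Hypothetical fact table of orders.
CREATE TABLE fact_orders (
    order_id    INTEGER,
    customer_id INTEGER,
    order_date  TEXT
);
INSERT INTO dim_customer VALUES
    (1, 'Austin', '2023-01-01', '2023-06-30'),
    (1, 'Denver', '2023-07-01', '9999-12-31');  -- current version
INSERT INTO fact_orders VALUES
    (100, 1, '2023-03-15'),
    (101, 1, '2023-09-02');
""")

# Join each fact row to the dimension version that was valid on the order date.
# ISO-formatted date strings compare correctly as text, so BETWEEN works here.
rows = conn.execute("""
SELECT f.order_id, f.order_date, d.city
FROM fact_orders AS f
JOIN dim_customer AS d
  ON f.customer_id = d.customer_id
 AND f.order_date BETWEEN d.valid_from AND d.valid_to
ORDER BY f.order_id
""").fetchall()

for row in rows:
    print(row)
```

In this sketch each order picks up the city the customer lived in when the order was placed, not the customer's current city, which is the point of keeping history in the dimension.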