If you've been in the data space long enough, you have probably come across really long SQL scripts that someone wrote years ago. No one dares to touch them, because they may be powering some important part of the data pipeline and everyone is scared of accidentally breaking them. If you feel > Rough SQL is a good place to start, but it cannot scale after a certain limit > That dogmatic KISS approach leads to unmaintainable systems > The simplest solution that takes the shortest time ...| www.startdataengineering.com
Working on a large codebase without any tests can be nerve-wracking. One wrong line of code or an inconspicuous library update can bring down your whole production pipeline! Data pipelines start simple, so engineers skip tests, but complexity increases rapidly, and the lack of tests can grind down your feature delivery speed. Starting to test can be especially tricky if you are working on a large legacy codebase with few to no tests. In long-running data pipelines, bad c...| www.startdataengineering.com
If you are trying to improve your data engineering skills or are the sole data person in your company, it can be hard to know how well your technical skills are developing. You may have questions like: Am I building pipelines the right way? How do I measure up to DEs at bigger tech companies? How do I get feedback on my pipeline design? These questions can cause a lot of uncertainty in career development! Imagine knowing that your code is on par with (or even better than) pipelines at tech-forward companies and tha...| www.startdataengineering.com