Through this quickstart guide, you will explore Snowpark Container Services. You will learn the basic mechanics of working with Snowpark Container Services and build several introductory services. Please note: this quickstart assumes existing familiarity with containerization (e.g., Docker) and basic familiarity with container orchestration.| quickstarts.snowflake.com
By: Kevin Kimani Payment gateways like Rapyd help businesses seamlessly handle transactions across different platforms and currencies. Rapyd allows businesses to receive, manage, and disburse funds through a range of payment methods. Its flexible platform helps businesses simplify their financial setup and improve the customer experience. It's ideal for online marketplaces, e-commerce stores, online trading, subscription-based platforms, and cross-border tr...| Rapyd Community
A significant data project portfolio can help set you apart from the run-of-the-mill candidate. Projects show that you are someone who can learn and adapt. Your portfolio informs a potential employer about your ability to continually learn, your knowledge of data pipeline best practices, and your genuine interest in the data field. Most importantly, it gives you the confidence to pick up new tools and build data pipelines from scratch. But setting up data infrastructure, with coding best pr...| www.startdataengineering.com
Why Git? Git is a postmodern version control system that offers the familiar capabilities...| Goglides Dev 🌱
First-Time Git Setup| git-scm.com
Are you part of an under-resourced team where adding time-saving dbt (data build tool) features takes a back seat to delivering new datasets? Do you want to incorporate time (& money) saving dbt processes but need more time? While focusing on delivery may help in the short term, delivery speed will suffer without a proper workflow! A good workflow will save time, prevent bad data, and ensure high development speed! Imagine the time (& mental pressure) savings if you didn't have to validate ...| www.startdataengineering.com
The command line can be your best friend during development, if you take some time to get to know its capabilities. But even if you know what it can help you with, you can still extend it. I'd like to introduce you to some of my favourite tools, fzf and oh-my-zsh. By the end of this post I hope you'll see how well they can work together with you.| Mátyás Budavári
Host your site on GitHub Pages with continuous deployment using project, user, or organization pages.| gohugo.io
Stream processing differs from batch; one needs to be mindful of the system's memory, event order, and system recovery in case of failures. However, understanding the fundamental concepts of time attributes, cluster memory, time-bounded joins, and system monitoring will enable you to build resilient and efficient streaming pipelines. If you are looking for an end-to-end streaming tutorial or a project to understand the foundational skills required to build streaming pipelines, this post is fo...| www.startdataengineering.com
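To make the time-bounded join idea concrete, here is a framework-free Python sketch (event names and the window size are illustrative, not from the post); the eviction step is exactly what keeps cluster memory bounded in a real streaming job:

```python
from collections import deque

JOIN_WINDOW_SECS = 600  # assumption: only join events at most 10 minutes apart

impressions = deque()  # buffered (event_time, ad_id) pairs awaiting a match

def on_impression(event_time: float, ad_id: str) -> None:
    # Buffer the impression until a click arrives or the window expires.
    impressions.append((event_time, ad_id))

def on_click(event_time: float, ad_id: str):
    # Evict impressions that fell out of the join window; without this
    # eviction the buffer (i.e. cluster memory) grows without bound.
    while impressions and impressions[0][0] < event_time - JOIN_WINDOW_SECS:
        impressions.popleft()
    # Emit a joined record for a matching impression inside the window.
    for ts, imp_ad in impressions:
        if imp_ad == ad_id:
            return {"ad_id": ad_id, "impression_ts": ts, "click_ts": event_time}
    return None
```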
Setting up data infra is one of the most complex parts of starting a data engineering project. Overwhelmed trying to set up data infrastructure with code? Or using DevOps practices such as CI/CD for data pipelines? In that case, this post will help! This post will cover the critical concepts of setting up data infrastructure, development workflow, and sample data projects that follow this pattern. We will also use a data project template that runs Airflow, Postgres, & Metabase to demonstrate...| www.startdataengineering.com
Struggling with setting up a local development environment for your Python data projects? Then this post is for you! In this post, you will learn how to set up a local development environment for data projects using Docker. By the end of this post, you will know how to set up your local development environment the right way with Docker. You will be able to increase developer ergonomics, increase development velocity, and reduce bugs.| www.startdataengineering.com
Data engineering project for beginners, using AWS Redshift, Apache Spark on AWS EMR, and Postgres, orchestrated by Apache Airflow.| www.startdataengineering.com
Struggling to come up with a data engineering project idea? Overwhelmed by all the setup necessary to start building a data engineering project? Don't know where to get data for your side project? Then this post is for you. We will go over the key components and help you understand what you need to design and build your data projects. We will do this using a sample end-to-end data engineering project.| www.startdataengineering.com
Worried about introducing data pipeline bugs, regressions, or breaking changes? Then this post is for you. In this post, you will learn what CI is, why it is crucial to have data tests as part of CI, and how to create a CI pipeline that automatically runs data tests on pull requests using GitHub Actions.| www.startdataengineering.com
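For a feel of what such a data test might look like, here is a minimal pytest sketch; `get_customer_orders()` is a hypothetical stand-in for a real transformation, and in CI a GitHub Actions workflow would simply install dependencies and run `pytest` on every pull request:

```python
import pandas as pd

def get_customer_orders() -> pd.DataFrame:
    # Hypothetical stand-in for the transformation under test; the real
    # pipeline would return its actual output here.
    return pd.DataFrame({"order_id": [1, 2], "amount": [10.0, 25.5]})

def test_order_ids_are_unique():
    # Catch accidental fan-out from a bad join.
    df = get_customer_orders()
    assert df["order_id"].is_unique

def test_amounts_are_non_negative():
    # Catch sign flips or bad currency conversions.
    df = get_customer_orders()
    assert (df["amount"] >= 0).all()
```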
Frustrated that hiring managers are not reading your GitHub projects? Then this post is for you. In this post, we discuss a way to impress hiring managers by hosting a live dashboard with near real-time data. We will also go over coding best practices such as project structure, automated formatting, and testing to make your code professional. By the end of this post, you will have deployed a live dashboard that you can link to your resume and LinkedIn.| www.startdataengineering.com
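As one illustrative way to serve such a dashboard (the post may use a different stack; the CSV URL and column names below are placeholders), a few lines of Streamlit suffice:

```python
import pandas as pd
import streamlit as st

@st.cache_data(ttl=60)  # re-fetch at most once a minute for near real-time data
def load_data() -> pd.DataFrame:
    # Assumption: latest metrics land at some public URL as CSV.
    return pd.read_csv("https://example.com/latest_metrics.csv")

df = load_data()
st.title("Live metrics")
st.line_chart(df, x="timestamp", y="value")
```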
Wondering how to execute a Spark job on an AWS EMR cluster, based on a file upload event on S3? Then this post is for you. In this post we go over how to trigger Spark jobs on an AWS EMR cluster, using AWS Lambda. The Lambda function will execute in response to an S3 upload event. We will go over this event-driven pattern with code snippets and set up a fully functioning pipeline.| www.startdataengineering.com
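A rough sketch of the pattern (not the post's exact code): a Lambda handler reads the S3 event and submits a spark-submit step to a running EMR cluster with boto3. The cluster ID and script path here are placeholders.

```python
import boto3

emr = boto3.client("emr")

CLUSTER_ID = "j-XXXXXXXXXXXXX"  # assumption: an already-running EMR cluster
SCRIPT_PATH = "s3://my-bucket/scripts/clean_data.py"  # hypothetical job script

def handler(event, context):
    # Pull the bucket/key of the file whose upload triggered this Lambda.
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]

    # Submit a spark-submit step to the EMR cluster for the new file.
    emr.add_job_flow_steps(
        JobFlowId=CLUSTER_ID,
        Steps=[{
            "Name": f"process {key}",
            "ActionOnFailure": "CONTINUE",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["spark-submit", SCRIPT_PATH, f"s3://{bucket}/{key}"],
            },
        }],
    )
```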
There are many ways to submit an Apache Spark job to an AWS EMR cluster using Apache Airflow. In this post we go over the steps to create a temporary EMR cluster, submit jobs to it, wait for the jobs to complete, and terminate the cluster, the Airflow way.| www.startdataengineering.com
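The shape of that DAG, as a sketch (assuming the Amazon provider package on Airflow 2; the cluster spec and step are trimmed placeholders, not the post's code):

```python
from datetime import datetime
from airflow import DAG
from airflow.providers.amazon.aws.operators.emr import (
    EmrAddStepsOperator,
    EmrCreateJobFlowOperator,
    EmrTerminateJobFlowOperator,
)
from airflow.providers.amazon.aws.sensors.emr import EmrStepSensor

SPARK_STEPS = [{  # assumption: a single spark-submit step
    "Name": "run_spark_job",
    "ActionOnFailure": "CANCEL_AND_WAIT",
    "HadoopJarStep": {
        "Jar": "command-runner.jar",
        "Args": ["spark-submit", "s3://my-bucket/scripts/job.py"],
    },
}]

with DAG("emr_transient_cluster", start_date=datetime(2024, 1, 1),
         schedule_interval=None, catchup=False) as dag:
    # Spin up a temporary cluster; the overrides hold the full cluster spec.
    create_cluster = EmrCreateJobFlowOperator(
        task_id="create_cluster",
        job_flow_overrides={"Name": "transient-cluster"},  # trimmed for brevity
    )
    # Submit the Spark step to the cluster we just created.
    add_steps = EmrAddStepsOperator(
        task_id="add_steps",
        job_flow_id=create_cluster.output,
        steps=SPARK_STEPS,
    )
    # Block until the step finishes (or fails).
    watch_step = EmrStepSensor(
        task_id="watch_step",
        job_flow_id=create_cluster.output,
        step_id="{{ task_instance.xcom_pull(task_ids='add_steps')[0] }}",
    )
    # Tear the cluster down even if the step failed, so it never lingers.
    terminate_cluster = EmrTerminateJobFlowOperator(
        task_id="terminate_cluster",
        job_flow_id=create_cluster.output,
        trigger_rule="all_done",
    )
    create_cluster >> add_steps >> watch_step >> terminate_cluster
```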
dbt (data build tool) tutorial. Build a project simulating a real-life ELT project using the data build tool.| www.startdataengineering.com
Using dbt you can test the output of your SQL transformations. If you have wondered how to "unit test" your SQL transformations in dbt, then this post is for you. In this post, we go over how to write unit tests for your SQL transformations with mock inputs/outputs and test them locally. This helps keep the development cycle shorter and enables you to follow a TDD approach for your SQL-based data pipelines.| www.startdataengineering.com
How to download, install, and start using the Heroku CLI. The Heroku CLI used to be part of the Heroku Toolbelt.| devcenter.heroku.com
Setting up a local development environment with the SvelteKit demo app and Prisma ORM| DockYard