The ETL & ELT tool market is experiencing continuous transformation, propelled by fluctuating pricing structures and the advent of inventive alternatives. This industry remains fiercely competitive due to these changing elements and a swiftly growing user base. In the following sections, we will explore four emerging alternatives to Fivetran. Of course, that is if you… Read more| Seattle Data Guy
Big data is touted as the solution to many problems. However, the truth is, big data has caused a lot of big problems. Yes, big data can provide more context and provide insights we have never had before. It also makes queries slow, data expensive to manage, requires a lot of expensive… Read more| Seattle Data Guy
Data integration is critical for organizations of all sizes and industries—and one of the leading providers of data integration tools is Talend, which offers the flagship product Talend Studio. In 2023, Talend was acquired by Qlik, combining the two companies’ data integration and analytics tools under one roof. In January 2024, Talend discontinued Talend Open… Read more| Seattle Data Guy
Document Intelligence Studio is a data extraction tool that can pull unstructured data from diverse documents, including invoices, contracts, bank statements, pay stubs, and health insurance cards. The cloud-based tool from Microsoft Azure comes with several prebuilt models designed to extract data from popular document types. However, you can also use labeled datasets to train… Read more| Seattle Data Guy
Looking for data science and engineering consultants! Our team is made up of data scientists, software developers and analysts ready to help design and deploy| Seattle Data Guy
Maybe you’re luckier than me. Maybe you’ve never opened a .sql file or an Airflow DAG only to be greeted by a 5,000+ line query…a true monster of a script that leaves you wondering where to begin. I’ve seen plenty of these, and every time, I ask myself: Why in the world do these exist? And, more… Read more| Seattle Data Guy
After working in data for over a decade, one thing that remains the same is the need to create data pipelines. Whether you call them ETLs/ELTs or something else, companies need to move and process data for analytics. The question becomes how companies are actually building their data pipelines. What ETL tools are they actually… Read more| Seattle Data Guy
Over the past three years our teams have noticed a pattern. Many companies looking to migrate to the cloud go from SQL Server to Snowflake. There are many reasons this makes sense. One common benefit was that teams found it far easier to manage than SQL Server and in almost every… Read more| Seattle Data Guy
If you work in data, then you’ve likely used BigQuery and you’ve likely used it without really thinking about how it operates under the hood. On the surface BigQuery is Google Cloud’s fully-managed, serverless data warehouse. It’s the Redshift of GCP except we like it a little more. The question becomes, how does it work?… Read more| Seattle Data Guy
Planning out your data infrastructure in 2025 can feel wildly different than it did even five years ago. The ecosystem is louder, flashier, and more fragmented. Everyone is talking about AI, chatbots, LLMs, vector databases, and whether your data stack is “AI-ready.” Vendors promise magic, just plug in their tool and watch your insights appear.… Read more| Seattle Data Guy
Since I started working in tech, one goal that kept coming up was workflow automation. Whether automating a report or setting up retraining pipelines for machine learning models, the idea was always the same: do less manual work and get more consistent results. But automation isn’t just for analytics. RevOps teams want to streamline processes… Read more| Seattle Data Guy
PDF files are one of the most popular file formats today. Because they can preserve the visual layout of documents and are compatible with a wide range of devices and operating systems, PDFs are used for everything from business forms and educational material to creative designs. However, PDF files also present multiple challenges when it… Read more| Seattle Data Guy
If you’re looking to pass hundreds of GBs of data quickly, you’re likely not going to use a REST API. That’s why every day, companies share data sets of users, patient claims, financial transactions, and more via SFTP. If you’ve been in the industry for a while, you’ve probably come across automated SFTP jobs that… Read more| Seattle Data Guy
Scraping data from PDFs is a rite of passage if you work in data. Someone somewhere always needs help getting invoices parsed, contracts read through, or dozens of other use cases. Most of us will turn to Python and our trusty list of Python libraries and start plugging away. Of course, there are many challenges… Read more| Seattle Data Guy
Success in the data world hinges on team setup. I’ve delved into onboarding and standards in previous articles, but never into the structure of data teams. Typically, there are three configurations: Centralized, Decentralized, and Federated. Most companies I’ve seen use a mix of these. While the newest tech breakthroughs… Read more| Seattle Data Guy
Billions of dollars have been invested in companies that fall under the concept of the “Modern Data Stack.” Fivetran has nearly one billion dollars in funding, DBT has 150 million (and is looking to raise more), Starburst has 100 million…and I could really go on and on about all the companies being funded. So that means every company has… Read more| Seattle Data Guy
If you’re a data engineer, then you’ve likely at least heard of Airflow. Apache Airflow is one of the most popular open-source workflow orchestration solutions that gets used for data pipelines. This is what spurred me to write the article “Should You Use Airflow” because there are plenty of people who don’t enjoy Airflow or… Read more| Seattle Data Guy
A few months ago, I uploaded a video where I discussed data warehouses, data lakes, and transactional databases. However, the world of data management is evolving rapidly, especially with the resurgence of AI and machine learning. There are numerous other methods that technical teams are utilizing to handle… Read more| Seattle Data Guy
Why Invest In A Data Warehouse? We recently had a medium-sized company ask us why they might want to build a data warehouse. (If you are curious about what a data warehouse is, feel free to read here) They had an operational relational database and a few ideas for products that we found brilliant. We… Read more| Seattle Data Guy