Querying billions of weather records and getting results in under 200 milliseconds isn’t theory; it’s what real-time analytics solutions provide. Processing streaming IoT data from thousands of sensors while delivering real-time dashboards with no lag is what certain business domains need. That’s what you’ll learn by the end of this guide by building a ClickHouse-modeled analytics use case. You’ll learn how to land data in ClickHouse that is optimized for real-time data applica...| Data Engineering Blog
Many ask themselves, “Why would I use a semantic layer? What is it anyway?” In this hands-on guide, we’ll build the simplest possible semantic layer using just a YAML file and a Python script—not as the goal itself, but as a way to understand the value of semantic layers. We’ll then query 20 million NYC taxi records with consistent business metrics executed using DuckDB and Ibis. By the end, you’ll know exactly when a semantic layer solves real problems and when it’s overkill.| Data Engineering Blog
A crafted digital vault of curated knowledge where insights and ideas connect and compound, modeled after a Zettelkasten to inspire learning and discovery.| www.ssp.sh
Learning to Think Again, and the Cost of AI Dependency. There are so many (hype/boring) posts about AI coming out every day. It’s OK to use it, and everyone does it, but still learn your craft, and try to think.| www.ssp.sh
A semantic layer is a translation layer that sits between your data and your business users, converting complex data into understandable business concepts. Let's discover its history, its trends, and whether it's here to stay.| Data Engineering Blog
I switched my five-year-old MacBook Pro M1 Max for a cheap (comparable) Lenovo ThinkBook 14 G7 ARP (AMD) laptop running Linux (Arch btw, or better, Omarchy), and I am having a blast. Not everything is perfect, but let’s not get ahead of ourselves. This is a short recap after using it for one month on and off (due to repair 😅), and the last 2 weeks full time. I want to share what I learned, what I like about the new setup after working for 15-plus years on a MacBook, and on and off on...| Data Engineering Blog
A comprehensive 3-week roadmap covering SQL, Python, cloud platforms, data modeling, and DevOps essentials for aspiring and practicing data engineers| Data Engineering Blog
A journey through my thoughts, my learning through studying Stoicism, life experiences, going to church, my family and friends and my life circumstances, through reading the Bible and being open-minded, to answer the question of what the meaning of life is for me. What are just distractions versus what are valuable things to learn, and ultimately, finding purpose in this short life we have.| Data Engineering Blog
Remember when data scientists spent 80% of their time wrestling with data wrangling instead of building models? I’d argue that today’s data engineers face similar challenges, but with the added complexity of infrastructure setup. We’re architects of entire data ecosystems, orchestrating everything from real-time pipelines to AI workflows. The secret? Infrastructure as Code and DevOps principles that transform scattered server management into elegant, declarative configurations. The catc...| Data Engineering Blog
A comprehensive article about conversational BI and MCP.| Data Engineering Blog
Moving from orchestration theory to the enterprise level is a real challenge. How do you handle secrets across environments? Where does your business logic actually live? How do you make pipelines that work for both your senior engineers and the analysts who need to modify them? In Part 1, The Heartbeat of Data Engineering, we discussed the convergent orchestrator combining orchestration as code and no-code. The dual interface with an advanced UI but with the generated code as YAML helps auto...| Data Engineering Blog
Discover how Apache Iceberg, DuckDB, and open catalogs transform data lakes into powerful lakehouses. Learn to query S3 data with SQL interfaces.| Data Engineering Blog
Why I self-host my websites, newsletter, and homelab—and the satisfaction that comes from building and using your own digital tools.| Data Engineering Blog
Discover how declarative data stacks are transforming modern data engineering workflows. Explore the evolution from imperative to declarative systems, and learn how tools like Kubernetes, dbt, and Airflow are shaping the future of data architecture.| Data Engineering Blog
How (Neo)Vim and Markdown revolutionized my data engineering and writing workflow.| Data Engineering Blog
A Vim-Inspired Approach to Efficient Note Management with Obsidian and Markdown| Data Engineering Blog