About Me I’m Aaron Zuspan, a research fellow at the U.S. Forest Service Pacific Northwest Research Station where I use remote sensing, machine learning, and open-source software to study post-fire forests. Sometimes I build things, and sometimes I write about: 🐍 Python 🧮 Algorithms ⚙️ Earth Engine 🔊 Audio| Signals & Pixels
Google recently released the AlphaEarth Foundations annual Satellite Embeddings dataset, a 64-dimensional vector representation of Earth’s surface that incorporates spectral reflectance, radar backscatter, climate, and topography in an analysis-ready format. Currently, the dataset is only available through Earth Engine, so if you want to work with it outside the platform, you’ll need to export it first. Earth Engine stores the embeddings in 64-bit floating point images. Exporting as-is m...| Blog on Signals & Pixels
I wrote a recent blog post about hacking together a crude version of Git hooks in the Jujutsu VCS. While this is a helpful last resort to avoid pushing unlinted, unformatted code to a remote, Jujutsu has a much more elegant built-in tool to solve the problem of messy local commits: jj fix. Once it’s configured, the fix command runs your formatter/linter over all of your writable commits, applying fixes retroactively and automatically rebasing without introducing merge conflicts.| Blog on Signals & Pixels
My day job involves a lot of time spent looking at NAIP aerial imagery of burned forests. The scene below is pretty common: large swaths of high severity fire leave behind hectares of standing dead snags. While the trees themselves can be hard to make out after needles drop, the distinctive shadows that they cast are a dead giveaway. Figure: Diagonal shadows cast by standing dead snags in a burned forest.| Blog on Signals & Pixels
I have a digital photo frame running on a Raspberry Pi hooked up to a small TV, and want an easy way to control playback. I already need a remote for the TV, so why not use that? Below I’ll go through the setup process I used to turn IR commands into key presses on a Pi running Bookworm, including an Ansible playbook for automating setup. Manual Setup The Hardware The first thing we need is an IR receiver to physically sense the infrared signals broadcast by the remote.| Blog on Signals & Pixels
If you’ve lived in the US – especially an area prone to extreme weather – you’re probably uncomfortably familiar with this sound: [audio clip] Aside from interrupting Bob’s Burgers with the threat of imminent death, the SAME header (as it’s known) marks the beginning of an emergency broadcast, encoding data about when, where, and why an alert is being issued. With the idea that we only fear what we don’t understand, let’s take a look at how ...| Blog on Signals & Pixels
scikit-learn’s support for deep learning is pretty limited, but it does offer just enough functionality to build a usable autoencoder. It won’t give you the flexibility or performance of a GPU-backed implementation in Pytorch or Tensorflow, but if you’re working with small datasets and need a lightweight dependency, it’ll do the job. This post was inspired by a Gist by @golamSaroar that demonstrated the basic functionality of an MLP autoencoder.| Blog on Signals & Pixels
I’m working on a project that requires splitting a large raster into a few thousand overlapping chunks and iteratively processing and accumulating each result into an output raster. After finding my naive Xarray implementation for accumulating results was painfully slow, I decided to spend some time benchmarking different approaches. Below, I go through three different strategies that ultimately reduced processing time from a couple hours to a few minutes. Using an Outer Arithmetic Join By...| Blog on Signals & Pixels
Solar Bitflip Sort is a joke sorting algorithm that relies on cosmic particles colliding with transistors in a memory node to randomly flip an array of bits into a miraculously sorted arrangement. But how long would it actually take to sort an array? A 1996 study by IBM found that their machines experienced about 1 bit flip per 256 MB of RAM per month, or $5.5 \times 10^{-9}$ flips per bit per year.| Blog on Signals & Pixels
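The per-bit rate quoted above can be sanity-checked with a few lines of arithmetic. This is only a rough sketch: it assumes "256 MB" means 256 × 2^20 bytes and that flips arrive at a constant rate, and the variable names are illustrative rather than taken from the post. Under those assumptions the result comes out close to the quoted figure.

```python
# Back-of-the-envelope check of the quoted bit-flip rate.
# Assumption: 256 MB is read as 256 * 2**20 bytes, 8 bits per byte.
BITS_IN_256_MB = 256 * 2**20 * 8
FLIPS_PER_YEAR = 12  # one flip per month across the whole 256 MB

# Flips per bit per year, on the order of 5e-9 to 6e-9
flips_per_bit_per_year = FLIPS_PER_YEAR / BITS_IN_256_MB

# Assuming a constant rate, the mean wait for one particular bit to flip
years_per_flip = 1 / flips_per_bit_per_year

print(f"{flips_per_bit_per_year:.2e} flips per bit per year")
print(f"{years_per_flip:.2e} years per flip for a single bit")
```

The small gap between this estimate and the quoted value most likely comes down to how a megabyte is counted and how the figure was rounded.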
I’ve been dabbling with Rust lately and looking for a good learning project to help ground some theory in practice. Skimming through Crash Course Computer Science gave me an idea: starting from the ground up, with bits and switches, let’s try to simulate some very basic digital circuits in Rust – latches, registers, and adders. By the end, we’ll hopefully have enough components to assemble a virtual “computer” capable of some rudimentary 4-bit arithmetic.| Blog on Signals & Pixels
eerepr is a Python package I wrote that generates HTML reprs for Earth Engine objects, converting JSON data to trees that can be navigated in a Jupyter notebook. Large objects representing tens of MBs of JSON with thousands of nested elements aren’t uncommon, so the last few months I’ve been focusing on performance optimizations within the HTML conversion process to speed up repr generation. This post details some successful and some unsuccessful experiments with optimizing the conversion...| Blog on Signals & Pixels
The MNIST dataset of hand-written digits is ubiquitous in image classification, the de facto “hello world” of 2D CNNs. It’s also famously easy – a decent model can achieve 98% accuracy with a few minutes of training. To keep things interesting, let’s try dropping a dimension. Can we make MNIST fun again by treating it as a 1D signal classification problem? Images to Signals There are a lot of ways to convert a 2D image to a 1D signal.| Blog on Signals & Pixels
I’m sick of blog posts written by LLMs. Let’s go back to the good old days, when content was procedurally generated by real algorithms. From now on, I’m writing all my blog posts using Markov chains like a 1980s Usenet bot (specifically, this one). But first I’ve got to figure out how. With an algorithm description from Wikipedia and a tentative grasp on the Rust programming language, let’s get started.| Blog on Signals & Pixels
The randomVisualizer method in Earth Engine assigns randomized colors to unique values in an image, providing a quick shortcut to visualize classified pixels without manually creating a color palette. Figure: Before and after applying color to a land cover map with randomVisualizer. The method’s convenience comes with some limitations. Because the palette is computed on-the-fly and applied directly to the image, the only way to know which colors are assigned to which values is to manually or progra...| Blog on Signals & Pixels
I’m building a domain-specific language called Arpeggio that compiles code into songs. I outlined a tentative syntax for the language in Part 1, explored the musical theory behind it in Part 2, and implemented a Python music engine to power it in Part 3. Now it’s time to finally connect those components together by writing a parser and interpreter that turns our custom language into playable music. Syntax and Semantics The language design was detailed in Part 1, but here’s a quick recap...| Blog on Signals & Pixels
I have an idea. When a satellite takes an image of Earth’s surface, the color that it sees is a function of land cover – trees, water, ice, sand. The spatial distribution of those land covers depends on everything from plate tectonics to seedling germination, but I’m betting you can make a decent guess at predicting land cover, and by extension surface color, using just climate. Cold and wet? Probably a boreal forest (green).| Blog on Signals & Pixels
I recently needed to export an image from Earth Engine to overlay with a local GeoTIFF. Translating the CRS and transforms between the local rasterio metadata and the format expected by GEE to get identical grids turned out to be surprisingly frustrating, so I thought I’d do a quick write up to hopefully save myself a future headache. Reference Metadata Assuming we have a local GeoTIFF reference.tif, we can grab the relevant metadata with:| Blog on Signals & Pixels
A few months back I posted my experiments deploying serverless Earth Engine functions using infrastructure-as-code with Pulumi. Here, I’m building on that code to add a Redis instance that will allow us to cache server-side computations, reducing the number of EE calls and speeding up responses. The Goal I’m going to build on the original demo function, which computed the cloud cover of the most recent Landsat 9 image (you can still run it here).| Blog on Signals & Pixels
UPDATE: It looks like the bug that prevented connecting to mapped network drives is fixed as of NordVPN 7.39.1.0, and this workaround now causes issues with VPN connections. If you’re still having issues with mapped drives, I suggest updating software first and only using this workaround as a last resort. After a software update on my Windows machine, I noticed that connecting NordVPN disconnected some mapped network drives on my local network.| Blog on Signals & Pixels
Vacuum tubes are an expensive, inefficient, and unreliable relic of early 1900s technology. So why are guitarists (and me) still lugging around old tube amps 80 years after the transistor made them obsolete? The short answer is that tube amps sound good when you overload their input signal, and transistor amps don’t. So, I can’t just swap my tubes for transistors one-to-one and get the same tone in an amp. But could I replace my tubes with a few billion transistors in a GPU running an op...| Blog on Signals & Pixels
I was recently introduced to the idea of a parity drive, a specially reserved hard drive in a data server array that can be used to recover data when another drive fails. The clever thing about a parity drive is that no matter how many hard drives are in your array, you only ever need one parity drive to provide fault tolerance for all of them. How is it possible for one hard drive to recreate data from 10 or even 100 other hard drives?| Blog on Signals & Pixels
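One common way to get single-drive fault tolerance is XOR parity, the scheme used in RAID 4 and RAID 5. The excerpt doesn't say which scheme the post describes, so treat the following as a minimal sketch under that assumption. The `xor_blocks` helper and the sample bytes are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data drives, each holding two bytes
drives = [b"\x0f\xa0", b"\x33\x55", b"\xc4\x01"]

# The parity drive stores the XOR of every data drive
parity = xor_blocks(drives)

# Simulate losing drive 1, then rebuild it from the survivors plus parity
survivors = [drives[0], drives[2], parity]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, XOR-ing the parity block with every surviving drive cancels their contributions and leaves exactly the missing drive's bytes. That holds for any number of data drives, as long as only one is lost at a time.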
Using evaluate with a callback function is the standard way to handle potential server-side errors in the Earth Engine JS API, e.g. parsing a date: ee.Date("2012").evaluate(function(result, err) { err ? print(err) : print(result); }); But what if we want to use promises instead? Let’s write a get_promise function that wraps any evaluatable object in a promise: /** * Return a promise that resolves when an object is evaluated. */ function get_promise(object) { return new Promise(function(reso...| Blog on Signals & Pixels
Some UI elements in Earth Engine (labels and buttons) support image icons, but the feature has some quirks that aren’t well-documented. Here’s a quick look at four ways you can implement icons in an Earth Engine app. This post was inspired by a tweet from @jstnbraaten that I have to dig up every time I want to use this feature. Loading From GStatic As described in the docs, external image icons can only be loaded from Google’s CDN, gstatic.| Blog on Signals & Pixels
Running Earth Engine code as a serverless function is normally a multi-step process that involves manually creating a service account through the web UI, downloading a credential file to zip with your code, and enabling and configuring a handful of APIs. It’s a hassle and a great opportunity for human error with misconfigured services and accidentally committed credentials. Infrastructure-as-code (IaC) allows you to set up, configure, and deploy cloud infrastructure programmatically instead...| Blog on Signals & Pixels
I’m building a domain-specific language called Arpeggio that compiles code into songs. When an Arpeggio program is run, the compiler will need a way to turn parsed instructions (e.g. play a C major chord for 1/4 of a measure) into playable audio. This post goes over the process of designing and building that backend API. Building a Song in Code In Part 2, I outlined how we can represent musical concepts like keys, modes, notes, and chords in Python.| Blog on Signals & Pixels
BigQuery, a cloud platform for storing and analyzing big tabular datasets, was added as an export option in Earth Engine last year, and I’ve been looking for an excuse to test it out ever since. It’s hard to get a clear idea of capability and cost just from running tutorials and reading pricing tables, so after some brainstorming I settled on a quick weekend project: Let’s count every cloudless Landsat scene in the Earth Engine catalog to see how data coverage has evolved over 51 years.| Blog on Signals & Pixels
I encountered an interesting challenge while building a Lark parser for my domain-specific language, which I thought was worth a quick write-up. Details are below, but the TLDR is that making your grammar a little more complex by wrapping lists of rules into a new rule can simplify parsing. The Issue My language supports variable-length lists of notes that are optionally followed by a repeat symbol. In code, that looks like:| Blog on Signals & Pixels
I’m building a domain-specific language called Arpeggio that compiles code into music. In Part 1, I outlined the basic language design and syntax. Here, I’m going to do a very shallow dive into music theory from a programmer’s perspective, focusing on the terms and concepts needed to build the music backend that powers Arpeggio. Disclaimer: Music theory is a huge field of study that’s filled with complexity and ambiguity, which I’m going to vastly oversimplify down to the basic math...| Blog on Signals & Pixels
I’ve tinkered with parsers for a few small side projects like building a language server and parsing blog post metadata, but I wanted to tackle a bigger parsing project. I’ve also been trying to learn some music theory in my spare time, so why not combine the two ideas by building a domain-specific language (DSL) for notating and generating music? I’m (tentatively) calling it Arpeggio. The Concept This is going to take a few blog posts to get through, so let’s just start at the beginn...| Blog on Signals & Pixels
In an effort to simplify and speed up my blog, I’m migrating from my custom-made NextJS SPA to Hugo, which means reformatting the frontmatter header of every blog post from this: --- title: My Blog Post category: Blog tags: blog, post, blog-post date: "2023-03-18" summary: Summary goes here --- to this: +++ title: "My Blog Post" tags: ["blog", "post", "blog-post"] date: "2023-03-18" summary: "Summary goes here" +++ Not a big deal to go through and change manually, but I recently dipped my t...| Blog on Signals & Pixels
Say you’re writing a typed Python package with a function head that takes a generic tuple and returns the first element. How do you implement it for type safety? Below, we’ll take inspiration from Alexis King’s Parse, Don’t Validate blog post and statically typed languages like Haskell and OCaml to help us avoid potential runtime errors in Python. The Naive Solution Let’s just return the first element: def head(t: tuple[T, .| Blog on Signals & Pixels
As far as computers are concerned, images and audio are fundamentally just arrays of numbers: signals changing over time (audio) or space (imagery). That got me thinking about what would happen if we encoded audio data into an imagery format. What would it look like, and more importantly, what would it sound like if we can manage to decode it? Enter AudioJPEG: The Worst Compression Algorithm™. Encoding Audio All we need to do to convert an audio file to an image file is:| Blog on Signals & Pixels
With summer winding down in the northern hemisphere and days getting shorter faster, I started wondering: How far would I need to travel today to get the same length of daylight tomorrow? To answer that question, we need to be able to approximate two things: (1) the length of daylight at a given latitude on a given day, and (2) the distance from one latitude to another. Figure: Day length throughout the year by latitude. The red band shows areas with 13.| Blog on Signals & Pixels
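Both quantities can be approximated with textbook formulas. The sketch below is not necessarily the approach the post takes. It assumes a simple cosine model for solar declination, the standard sunrise hour-angle equation, and a spherical Earth with a mean radius of 6371 km. It ignores atmospheric refraction and the size of the solar disk, and the function names are illustrative.

```python
import math


def day_length_hours(latitude_deg: float, day_of_year: int) -> float:
    """Approximate hours of daylight at a latitude on a given day of the year."""
    # Cosine approximation of solar declination, in radians
    declination = math.radians(-23.44) * math.cos(
        2 * math.pi / 365 * (day_of_year + 10)
    )
    latitude = math.radians(latitude_deg)
    # Cosine of the sunrise hour angle, clamped to handle polar day and night
    cos_hour_angle = -math.tan(latitude) * math.tan(declination)
    cos_hour_angle = max(-1.0, min(1.0, cos_hour_angle))
    return 24 / math.pi * math.acos(cos_hour_angle)


def meridian_distance_km(
    lat1_deg: float, lat2_deg: float, earth_radius_km: float = 6371.0
) -> float:
    """Approximate north-south distance between two latitudes on a sphere."""
    return earth_radius_km * abs(math.radians(lat2_deg - lat1_deg))


# Example: daylight at 45°N near the June solstice, and the length of one degree
solstice_hours = day_length_hours(45.0, 172)
one_degree_km = meridian_distance_km(45.0, 46.0)
```

With these two pieces, one way to answer the question is to search for the latitude whose day length tomorrow matches today's day length at your current latitude, then convert the latitude difference to a distance.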
Before AI assistants, modern IDEs, syntax highlighting, and version control, software was written by making holes in pieces of paper called punched cards. I’ve read the horror stories from this era - about waiting all day for a university computer to run your program just to find out one card was backwards, about repunching a card because you hit a wrong key on the keypunch machine, about reassembling an entire Fortran program line by line after dropping a deck of cards on the ground.| Blog on Signals & Pixels
Dask is a Python package that allows you to easily parallelize data analysis, whether you’re working with arrays, dataframes, or pretty much any other data format. With a little effort, we can get it to work with an Earth Engine feature collection, allowing us to convert from cloud-based vector data to a client-side geodataframe with parallel requests and lazy evaluation. A Quick Intro to Dask Dataframes A Dask dataframe works a lot like a Pandas dataframe with a few notable benefits:| Blog on Signals & Pixels
Area isn’t everything. Knowing how much old-growth forest, high-severity fire, elk migration corridor, etc. falls within your landscape is valuable, but often it only tells part of a story. Two fires might burn the same amount of forest, but if one torches a homogeneous swath across the landscape while the other disperses over dozens of small, disconnected patches, those forests will regenerate very differently. Patch metrics quantify spatial patterns to provide a deeper understanding of wh...| Blog on Signals & Pixels
There are two ways to download image data from Earth Engine: exporting to Drive or downloading from URL. Each has pros and cons, but for quickly pulling data into Python, URL downloading is the only option (just watch out for the pesky 32 MB file size limit). I built a Python package called wxee a while back that uses the URL download system to turn Earth Engine image collections into xarray datasets, and as part of some much-needed improvements to that package, I decided to take a closer lo...| Blog on Signals & Pixels
My first attempts at programming were writing hacky embedded C++ to run Arduino microcontrollers, brute forcing my way through compiler errors to try to make LEDs blink and motors spin. These days I write most of my code in high-level, dynamically-typed languages, but there’s still something alluring about that intersection of software and hardware - about watching your code run in the physical world. So, leaving behind the comfort of high-level abstractions and garbage-collected memory man...| Blog on Signals & Pixels
I recently built Conway’s Game of Life in Earth Engine. It was a fun experiment, but I felt like it ignored one of the coolest aspects of cellular automata in Earth Engine—easy access to petabytes of geospatial data. So I decided to build a cellular automaton that would use elevation data to roughly simulate changes in sea level. If you want to run it yourself, check out the Earth Engine app.| Blog on Signals & Pixels
There are plenty of tools to calculate slope, aspect, and hillshading from elevation data, but if you’ve ever been curious about how they’re calculated, this post goes through the process of implementing those algorithms from scratch in Python using just Numpy. Figure: A DEM height map of Mount St. Helens that we’ll use for calculating topography. Slope The concept of slope is simple: How much does elevation change within an area? Steep areas have lots of elevation change over a short distance ...| Blog on Signals & Pixels
Cellular automata are a type of computer program that can create complex, emergent behavior by applying simple rules to determine the state of cells on a grid over time. The typical cellular automaton works something like this: (1) Create a 2D array of cells and assign a random state (e.g. alive/dead) to each cell in the array. (2) Determine the next state of each cell based on its current state, the states of the cells around it, and a fixed set of rules.| Blog on Signals & Pixels
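The two steps described above can be sketched in plain Python. For the "fixed set of rules", this example assumes Conway's Game of Life: a dead cell with exactly three live neighbors becomes alive, and a live cell survives with two or three. It also assumes the grid wraps around at the edges. Both choices are illustrative and not taken from the excerpt.

```python
import random


def random_grid(rows, cols, seed=None):
    """Step 1: create a 2D grid and assign each cell a random state (1 = alive)."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(cols)] for _ in range(rows)]


def step(grid):
    """Step 2: compute every cell's next state from its eight neighbors."""
    rows, cols = len(grid), len(grid[0])
    new_grid = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Count live neighbors, wrapping around the grid edges
            neighbors = sum(
                grid[(r + dr) % rows][(c + dc) % cols]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            # Game of Life rules: birth on 3, survival on 2 or 3
            alive = neighbors == 3 or (grid[r][c] == 1 and neighbors == 2)
            new_grid[r][c] = 1 if alive else 0
    return new_grid


# Run a small random grid forward a few generations
grid = random_grid(8, 8, seed=42)
for _ in range(5):
    grid = step(grid)
```

Each generation is written into a fresh grid so that every cell's update sees only the previous generation's states, which is what makes the rules apply to all cells simultaneously.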
In a recent blog post, I found that shrinking an Earth Engine module’s size by 75% had almost no effect on import speed in the Code Editor because most of the time was spent waiting for Earth Engine to find it, not downloading its contents. If that’s the case, then is a module that’s split up across multiple files much slower to import than a module contained in a single file?| Blog on Signals & Pixels
In this post, we’ll go into the weeds on the Earth Engine module importing system to answer the question, does minifying source code speed up imports? Quick answer? A little bit, but there’s another bottleneck that makes file size almost irrelevant. What is Minifying? In web development, source code files are usually minified before they’re distributed, removing comments and extra whitespace in order to compress file sizes and speed up page loading.| Blog on Signals & Pixels
A tech blog about programming, Python, and open-source geospatial software.| Signals & Pixels
Fake Git hooks in jj using pre-commit and aliases.| Signals & Pixels