There is a simple but often overlooked technique to optimize performance-sensitive code: merging (or manually inlining) functions. We often build series of low-level functions that execute various computations that we then combine to perform higher-level tasks. When taken in isolation, each of those functions does exactly what it should and might even be perfectly optimized. However, when a series of those functions work together, unnecessary or duplicated work might appear. Let’s look at a...| Posts on Romain Guy
The Android Runtime (ART) offers a nice memory safety feature when accessing the content of an array. The indices you use are automatically checked against the bounds of the array to prevent unsafe memory accesses. To achieve this, ART generates extra machine instructions to throw an ArrayIndexOutOfBoundsException when the index is invalid. Here is a simple Kotlin example: fun scaleZ(values: FloatArray, scale: Float) = values[2] * scale After translation to arm64 assembly, we obtain the foll...| Romain Guy
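To make the bounds check described in the entry above concrete, here is a minimal Kotlin sketch. `scaleZ` is the function quoted in the excerpt; `scaleZOrNull` is a hypothetical helper added only to make the exception observable, and is not taken from the post.

```kotlin
// From the excerpt: reads the third element (index 2) and scales it.
fun scaleZ(values: FloatArray, scale: Float) = values[2] * scale

// Hypothetical helper (an assumption, not from the post): returns null instead of
// propagating the ArrayIndexOutOfBoundsException thrown when the array holds
// fewer than three elements.
fun scaleZOrNull(values: FloatArray, scale: Float): Float? =
    try {
        scaleZ(values, scale)
    } catch (e: ArrayIndexOutOfBoundsException) {
        null
    }
```

The check that produces the exception is implicit: nothing in `scaleZ` tests the index, yet the runtime still validates it on every access.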
Before we dive into today’s topic, I would like to make it clear that what follows is specific to how Android, and more precisely the Android Runtime (ART), works. Some of what follows applies to other environments as well, but the main twist is about Android. If you have read my previous articles, you should know by now that seemingly small or irrelevant changes can have a large impact on performance. So let’s look at another one! We’ll start our journey with a class containing a bunch...| Posts on Romain Guy
In the last post, we saw that benchmarks don’t always measure what we think they measure. Let’s look at another instance of this problem today, starting with this rather simple benchmark: @RunWith(AndroidJUnit4::class) class DataBenchmark { @get:Rule val benchmarkRule = BenchmarkRule() // Generate data in [0..255] private val data = IntArray(65_536) { it % 256 } @Test fun processData() { var sum = 0f benchmarkRule.measureRepeated { for (d in data) { if (d < 128) { sum += d ...| Posts on Romain Guy
Optimizing code can be a difficult task because there are so many traps you need to avoid at every step of the way. Today I want to focus on one of the (numerous) benchmarking traps, which you may have run into, and that I myself encounter regularly. Let’s imagine you are trying to optimize code, and you notice the use of a value.pow(2f). One obvious way to optimize this is to replace the function call with a multiplication (value * value), but is it worth it? Since you are a diligent engin...| Romain Guy
BlurHash is a compact representation of placeholders for images. A blur hash is encoded as a short string that can be rendered to a bitmap at runtime to display a “blurry” version of the source image. The way it works reminds me of how spherical harmonics are used in 3D rendering engines to efficiently encode irradiance. I recently remembered that I had been meaning to look at the Kotlin implementation of BlurHash to see if there was a way to make it faster. I looked up the KMP port and af...| Romain Guy
Today I would like to show you a micro-optimization I recently used more for the fun of it than for its real impact. It’s an interesting trick that you should never bother to use, nor worry about. It starts from a piece of Kotlin code that looked a bit like this (the original version used named constants, but I replaced them with their actual values for clarity in this context):| www.romainguy.dev
Jake Wharton recently caused me to go down yet another silly optimization rabbit hole when he nonchalantly linked to a piece of code used to count the number of digits in a Long during a Slack conversation about Kotlin’s lack of ternary operator. This of course triggered folks like Madis Pink and me to want to optimize it… Counting digits The simplest way to count the number of digits would be to compute log10(n).| www.romainguy.dev
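As a rough illustration of the digit-counting problem in the entry above, here is a minimal Kotlin sketch of two straightforward approaches. Both function names are hypothetical, both assume a non-negative input, and neither is presented as the code from the post or the Slack conversation.

```kotlin
import kotlin.math.log10

// Hypothetical helper: the log10 idea mentioned in the excerpt.
// Caveat: the Long is converted to a Double first, so large values that a Double
// cannot represent exactly (for example 999_999_999_999_999_999) can round up to
// the next power of ten and report one digit too many.
fun digitCountLog10(n: Long): Int {
    require(n >= 0) { "n must be non-negative" }
    if (n == 0L) return 1
    return log10(n.toDouble()).toInt() + 1
}

// Hypothetical helper: exact for every non-negative Long, one division per digit.
fun digitCountLoop(n: Long): Int {
    require(n >= 0) { "n must be non-negative" }
    var count = 1
    var v = n
    while (v >= 10) {
        v /= 10
        count++
    }
    return count
}
```

The loop version trades speed for exactness, which makes it a convenient reference when checking a faster implementation.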
I recently discussed an optimization that I worked on following Leland’s successful nerd snipe. That, however, was not the end of it. He also needed to test for intersecting/overlapping rectangles. The most obvious way to achieve this is pretty straightforward: // A rectangle is defined by its left (l), top (t), right (r), and bottom (b) coordinates data class Rect(val l: Int, val t: Int, val r: Int, val b: Int) { fun overlaps(other: Rect) = l < other.| www.romainguy.dev
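The excerpt above is cut off before the overlap test is complete. For reference, here is a minimal Kotlin sketch of the textbook interval-based test. `Rect` mirrors the definition in the excerpt, while `rectsOverlap` is a hypothetical free function and is not a reconstruction of the post's truncated code. It assumes `l <= r` and `t <= b`, and treats rectangles that only share an edge as not overlapping.

```kotlin
// Same shape as the Rect in the excerpt: left, top, right, bottom.
data class Rect(val l: Int, val t: Int, val r: Int, val b: Int)

// Hypothetical helper: two rectangles overlap when their horizontal ranges
// intersect and their vertical ranges intersect. Strict comparisons mean that
// rectangles touching along an edge do not count as overlapping.
fun rectsOverlap(a: Rect, b: Rect): Boolean =
    a.l < b.r && b.l < a.r && a.t < b.b && b.t < a.b
```

Whether shared edges should count as an overlap is a design choice; switching the comparisons to `<=` would include them.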
While my work responsibilities do not leave me much time to write code nowadays, I have managed to make a few small contributions to Jetpack Compose in the last few months, mostly focusing on performance. If you are an Android app developer, your performance concerns probably start and stop at a fairly high level. I find working on large scale libraries like Compose fascinating because you need to worry about performance not only at a macro level, but also at a micro level.| www.romainguy.dev
Leland and I were recently discussing how to best implement a new data structure to speed up a specific aspect of Jetpack Compose. He came up with a great idea, and nerd sniped me in the process. The problem was to efficiently encode the occupancy of an 8x8 grid represented as a Long (each bit representing a cell in the grid). After coming up with the bit twiddling code that quickly “rasterizes” a rectangle into the grid as a bitfield, I found myself thinking about how incredibly helpful ...| www.romainguy.dev
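To illustrate the encoding described in the entry above, here is a minimal Kotlin sketch that marks a rectangle of cells in an 8x8 grid stored in a `Long`. The function name `rasterize`, the half-open coordinate convention, and the bit layout (bit index = `y * 8 + x`) are all assumptions made for this example. It uses a plain loop over rows and is not the bit twiddling code from the post.

```kotlin
// Hypothetical helper: sets the bits for cells with l <= x < r and t <= y < b.
// Assumed layout: cell (x, y) maps to bit y * 8 + x, so each row is one byte.
fun rasterize(l: Int, t: Int, r: Int, b: Int): Long {
    require(l in 0..8 && r in l..8) { "horizontal range must satisfy 0 <= l <= r <= 8" }
    require(t in 0..8 && b in t..8) { "vertical range must satisfy 0 <= t <= b <= 8" }

    // Bits l until r set within a single 8-bit row (zero when the range is empty).
    val rowMask = ((1L shl (r - l)) - 1L) shl l

    // Copy the row mask into every covered row.
    var grid = 0L
    for (y in t until b) {
        grid = grid or (rowMask shl (y * 8))
    }
    return grid
}
```

With this layout, testing whether a cell is occupied is a single mask operation: `grid and (1L shl (y * 8 + x)) != 0L`.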