This post is about general techniques for handling end-of-buffer checks in code that processes an input stream a byte at a time, or a few bytes at a time at the most. Concretely, I’ll be talk…| The ryg blog
A lot of bit manipulation operations exist basically everywhere: AND, OR, XOR/EOR, shifts, and so forth. Some exist only on some architectures but have obvious implementations everywhere else ̵…| The ryg blog
This is a small one, for all compression geeks out there. The Elias gamma code (closely related to Exponential-Golomb codes, or Exp-Golomb codes for short) is very useful for practical compression …| The ryg blog
(Continued from part 1.) Last time, I established the basic problem and went through various ways of doing shifting and masking, and the surprising difficulties inherent therein. The “bit ext…| The ryg blog
Resurrecting an old tradition of this blog, let’s take a simple problem, go over a way too long list of alternative solutions, and evaluate their merits. Our simple problem is this: we want t…| The ryg blog
Nobody should voluntarily call themselves that. “Content” is the language of people on the distribution side of things. If you look at something like a 19th century newspaper it’s…| The ryg blog
In the last part we went over the general ideas of Huffman coding as implemented in the newer Oodle Data coders, this time we’ll be looking at one particular implementation that is both inter…| The ryg blog
(Continued from part 2. “A whirlwind introduction to dataflow graphs” is required reading.) Last time, we saw a whole bunch of different bit reader implementations. This time, I’l…| The ryg blog
Last time I covered the big picture, so we know the ground rules for the modular entropy coding layer in Oodle Data: bytestream consisting of several independent streams, pluggable algorithms, byte…| The ryg blog
There’s a hardware problem affecting Intel 13th/14th gen CPUs, mostly desktop ones. This made the rounds through the press last year and has been on forums etc. for much longer than that. For…| The ryg blog
I mentioned in a previous post that doing exact UNORM or SNORM conversions to float in hardware was not particularly expensive, but didn’t go into detail how. Let’s rectify that! (If yo…| The ryg blog
For BC6H encoding in Oodle Texture, we needed a sensible error metric to use in the encoder core. BC6H is HDR which is more challenging than LDR data since we need to handle vast differences in magnitude. BC6H internally essentially treats the float16 bits as 16-bit integers (which is a semi-logarithmic mapping) and works with […]| The ryg blog
GPUs support UNORM formats that represent a number inside [0,1] as an 8-bit unsigned integer. In exact arithmetic, the conversion to a floating-point number is straightforward: take the integer and…| The ryg blog
This is a small but nifty solution I discovered while working on IO code for Iggy. I don’t claim to have invented it – as usual in CS, this was probably first published in the 60s or 70…| The ryg blog
That’s right, it’s another texture compression blog post! I’ll keep it short. By “solid-color block”, I mean a 4×4 block of pixels that all have the same color. ASTC has a dedicated encoding for these (“void-extent blocks”), BC7 does not. Therefore we have an 8-bit RGBA input color and want to figure out how to […]| The ryg blog
A while back I had to deal with a bit-packed format that contained a list of integer values encoded in one of a pre-defined sets of bit widths, where both the allowed bit widths and the signed-ness…| The ryg blog
The x86 instruction set has a somewhat peculiar set of SIMD integer multiply operations, and Intel’s particular implementation of several of these operations in their headline core designs ha…| The ryg blog
This one originally came up for me in Oodle Texture’s BC7 decoder. In the BC7 format, each pixel within a 4×4 block can choose from a limited set of between 4 to 16 colors (ignoring some…| The ryg blog
This is a result I must have re-derived at least 4 times by now in various ways, but this time I’m writing it down so I just have a link next time. All right. If you’re encoding a BCn o…| The ryg blog
BC1-7 (variously also known by the older names S3TC, DXTC, RGTC and BPTC) are the standard compressed texture formats on PC. The newer BC6H and BC7 formats have precise specs that require bit-exact…| The ryg blog
This post is part of a series – go here for the index. And welcome back to my impromptu optimization series. Today, we won’t see a single line of code, nor a profiler screenshot. That&#…| The ryg blog
It’s been a while! Last time, I went over how the 3-stream Huffman decoders in Oodle Data work. The 3-stream layout is what we originally went with. It gives near-ideal performance on the las…| The ryg blog
While in the middle of writing “Reading bits in far too many ways, part 3”, I realized that I had written a lot of background material that had absolutely nothing to do with bit I/O and…| The ryg blog