I'm a big fan of aarch64's csel family of instructions. A single instruction can evaluate rd = cond ? rs1 : f(rs2), where cond is any condition code and f is any of f0(x) = x or f1(x) = x+1 or f2(x) = ~x or or f3(x) = -x. Want to convert a condition to a boolean? Use f1 with rs1 == rs2 == x0. Want to convert a condition to a mask? Use f2 with rs1 == rs2 == x0. Want to compute an absolute value? Use f3 with rs1 == rs2. It is pleasing that the composition of f1 and f2 is f3. I could continue e...| www.corsix.org
I'm a big fan of aarch64's csel family of instructions. A single instruction can evaluate rd = cond ? rs1 : f(rs2), where cond is any condition code and f is any of f0(x) = x or f1(x) = x+1 or f2(x) = ~x or or f3(x) = -x. Want to convert a condition to a boolean? Use f1 with rs1 == rs2 == x0. Want to convert a condition to a mask? Use f2 with rs1 == rs2 == x0. Want to compute an absolute value? Use f3 with rs1 == rs2. It is pleasing that the composition of f1 and f2 is f3. I could continue e...| www.corsix.org
| corsix.org
This blog series (1, 2, 3, 4, 5, 6, 7) has given an introduction to Wormhole and a tour of some of its low-level functionality. If it has whetted your appetite, then the follow-up is the tt-isa-documentation repository I've been writing - it is the comprehensive reference manual going into detail on all the pieces. The reference manual bookends the blog series; there's no need for more blog parts, as the manual contains all the material I'd want to cover.| www.corsix.org
Computing the CRC-32C of a 4KB buffer on a modern x86 chip can be done as a simple loop around the crc32 instruction:| www.corsix.org