I’m not an optimization guru by any means. It’s never been something I’ve been allowed to focus on at work, sadly. At some jobs, performance is secondary to correctness and robustness, and at others, it’s secondary to flashy features. But I’ve used the following tricks in hot loops:

- Dimension reduction (esp. via convolution)
- Branchless calculation
- SIMD

SIMD gets a lot of love, but it’s a constant-factor improvement and can be tough to coax out of the compiler (unless you use a libra...
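
Of the tricks named above, branchless calculation is the easiest to show in a few lines. Here is a minimal sketch in C, assuming 32-bit two's-complement integers; the function names `max_branchy` and `max_branchless` are made up for illustration and don't come from any particular codebase. The idea is to turn a comparison into a bit mask and use the mask to select a value, rather than jumping on the comparison.

```c
#include <stdint.h>

/* Branchy version: may compile to a conditional jump, which can
   mispredict when the inputs are unpredictable. */
int32_t max_branchy(int32_t a, int32_t b) {
    if (a > b) {
        return a;
    }
    return b;
}

/* Branchless version (illustrative): the comparison yields 0 or 1,
   so negating it gives a mask of all zeros or all ones. */
int32_t max_branchless(int32_t a, int32_t b) {
    int32_t mask = -(int32_t)(a > b);  /* -1 (all ones) if a > b, else 0 */
    /* If mask is all ones: b ^ (a ^ b) == a. If mask is zero: b ^ 0 == b. */
    return b ^ ((a ^ b) & mask);
}
```

Whether this actually helps depends on the compiler and the data: many optimizers already turn the branchy form into a conditional move, so it's worth checking the generated assembly and measuring before committing to the less readable version.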