Since the work of Kaligosi and Sanders (2006), it has been well known that Quicksort -- commonly considered one of the fastest in-place sorting algorithms -- suffers in an essential way from branch mispredictions. We present a novel approach to address this problem by partially decoupling control flow from data flow: to perform the partitioning, we split the input into blocks of constant size (we propose 128 data elements); then, all elements in one block are compared with the pivot a...| arXiv.org
The Continental Club in Austin, Texas, USA| The D Blog
Defined in header <algorithm>| en.cppreference.com
Branchless Lomuto Partitioning| orlp.net
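
The block-wise idea in the arXiv excerpt above can be illustrated with a minimal sketch. It is not the paper's implementation: it is a simplified one-sided (Lomuto-style) variant, and it assumes that comparison outcomes are buffered as offsets within the block and that elements are moved in a separate second pass. The names `kBlockSize` and `block_partition` are hypothetical and chosen only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Block size taken from the excerpt (128 elements); offsets fit in one byte.
constexpr std::size_t kBlockSize = 128;

// Hypothetical helper: moves all elements < pivot to the front of v and
// returns the number of such elements. Simplified one-sided sketch.
std::size_t block_partition(std::vector<int>& v, int pivot) {
    std::uint8_t offsets[kBlockSize];
    std::size_t boundary = 0;  // invariant: v[0..boundary) < pivot

    for (std::size_t start = 0; start < v.size(); start += kBlockSize) {
        const std::size_t n = std::min(kBlockSize, v.size() - start);

        // Pass 1: compare every element of the block with the pivot.
        // The store is unconditional; only the counter depends on the
        // comparison, so the loop body has no data-dependent branch.
        std::size_t count = 0;
        for (std::size_t i = 0; i < n; ++i) {
            offsets[count] = static_cast<std::uint8_t>(i);
            count += (v[start + i] < pivot);
        }

        // Pass 2: rearrange using the buffered offsets. The trip count is
        // known before the loop starts.
        for (std::size_t j = 0; j < count; ++j) {
            assert(boundary <= start + offsets[j]);
            std::swap(v[boundary], v[start + offsets[j]]);
            ++boundary;
        }
    }
    return boundary;
}
```

Calling `block_partition(v, 5)` on `{5, 1, 9, 3, 7, 2, 8}` returns 3 and leaves the three elements smaller than 5 at the front. Whether a compiler actually emits branch-free code for the first loop depends on the target and optimization settings, so this should be checked rather than assumed.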