Table of Contents Questions and remarks on PTHash paper Ideas for improvement Parameters Align packed vectors to cachelines Prefetching Faster modulo operations Store dictionary \(D\) sorted using Elias-Fano coding How many bits of \(n\) and hash entropy do we need? Ideas for faster construction Implementation log Hashing function Bitpacking crates Construction Fastmod TODO Try out fastdivide and reciprocal crates First benchmark Faster bucket computation Branchless, for real now! (aka the tr...| CuriousCoding
Table of Contents Abstract 1 Introduction 2 Related work 3 PtrHash 3.1 Overview 3.2 Construction 3.3 Bucket Assignment Functions 3.4 Remapping using CacheLineEF 4 Results 4.1 Construction 4.1.1 Bucket Functions 4.1.2 Tuning Parameters for Construction 4.1.3 Remap 4.2 Comparison to Other Methods 5 Conclusions and Future Work Appendix A: Query Throughput Batching and Streaming Evaluation Multi-threaded Throughput. Appendix B: Sharding Evaluation This is the HTML version of my SEA 2025 paper on ...| CuriousCoding
Table of Contents Possible speedup? BBHash (Limasset et al. 2017) uses multiple layers to create a minimal perfect hash function (MPHF) that hashes some input set into \([n]\). (See also my note on PTHash (Pibiri and Trani 2021).) Simply put, it maps the \(n\) elements into \([\gamma \cdot n]\) using hash function \(h_0\). The \(k_0\) elements that have collisions are mapped into \([\gamma \cdot k_0]\) using \(h_1\). Then, the \(k_1\) elements with collisions are mapped into \([\gamm...| CuriousCoding
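The layered scheme from the BBHash excerpt can be sketched in a few lines of Rust. This is a minimal illustration under stated assumptions, not the implementation of Limasset et al.: the names `LayeredMphf`, `level_hash`, `build`, and `index` are hypothetical, the per-level hash \(h_i\) is simulated by feeding `(level, key)` to the standard library's `DefaultHasher`, each level is a plain `Vec<bool>`, and the rank is computed by a linear scan rather than a succinct rank structure. Keys are assumed to be distinct and \(\gamma \geq 1\); otherwise construction may not terminate.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for h_level: hash `key` into [0, size).
fn level_hash(key: u64, level: usize, size: usize) -> usize {
    let mut h = DefaultHasher::new();
    (level, key).hash(&mut h);
    (h.finish() % size as u64) as usize
}

/// One bit array per level; a set bit means exactly one key landed in that slot.
struct LayeredMphf {
    levels: Vec<Vec<bool>>,
}

impl LayeredMphf {
    /// Assumes distinct keys and gamma >= 1.
    fn build(keys: &[u64], gamma: f64) -> Self {
        let mut levels: Vec<Vec<bool>> = Vec::new();
        let mut remaining: Vec<u64> = keys.to_vec();
        while !remaining.is_empty() {
            let level = levels.len();
            // Level i has gamma * k_{i-1} slots (k_{-1} = n).
            let size = ((gamma * remaining.len() as f64).ceil() as usize).max(1);
            // Count how many remaining keys land in each slot.
            let mut count = vec![0u8; size];
            for &k in &remaining {
                let slot = level_hash(k, level, size);
                count[slot] = count[slot].saturating_add(1);
            }
            // Slots hit exactly once are settled at this level.
            levels.push(count.iter().map(|&c| c == 1).collect());
            // Colliding keys move on to the next level.
            remaining.retain(|&k| count[level_hash(k, level, size)] != 1);
        }
        Self { levels }
    }

    /// Index in [0, n) for a key from the input set.
    /// Keys outside the set may return `None` or an arbitrary index.
    fn index(&self, key: u64) -> Option<usize> {
        let mut offset = 0; // set bits in all earlier levels
        for (level, bits) in self.levels.iter().enumerate() {
            let slot = level_hash(key, level, bits.len());
            if bits[slot] {
                // Naive rank: set bits before `slot` in this level.
                let rank = bits[..slot].iter().filter(|&&b| b).count();
                return Some(offset + rank);
            }
            offset += bits.iter().filter(|&&b| b).count();
        }
        None
    }
}
```

Calling `LayeredMphf::build(&keys, 2.0)` and then `index(k)` for every input key should yield each value in \([n]\) exactly once, since every key owns exactly one set bit across all levels. A larger \(\gamma\) means fewer collisions per level and therefore fewer levels to probe, at the cost of more bits per key.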