I recently began writing a bounded, wait-free MPSC queue in Rust. A common piece of advice for ring-buffer-based implementations is to keep the head and tail indices from mapping to the same cache line. In this article, I want to measure the concrete performance penalty of false sharing for my data structure by running tests on both ARM (Apple Silicon) and x86 (Intel/AMD) processors.
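To make the advice concrete, here is a minimal sketch of how head and tail indices are commonly kept on separate cache lines in Rust. The names `CachePadded`, `Indices`, and `index_distance` are hypothetical and are not taken from my queue. The 128-byte alignment is an assumption chosen to cover both the 64-byte lines typical of x86 and the 128-byte lines reported for Apple Silicon.

```rust
use std::sync::atomic::AtomicUsize;

/// Wrapper that forces its contents onto their own aligned block.
/// 128 bytes is an assumed upper bound on the cache-line size.
#[repr(align(128))]
pub struct CachePadded<T>(pub T);

/// Hypothetical index pair for a ring buffer: each index is padded,
/// so a write to one does not invalidate the line holding the other.
pub struct Indices {
    pub head: CachePadded<AtomicUsize>,
    pub tail: CachePadded<AtomicUsize>,
}

/// Returns the distance in bytes between the two indices in memory.
pub fn index_distance() -> usize {
    let idx = Indices {
        head: CachePadded(AtomicUsize::new(0)),
        tail: CachePadded(AtomicUsize::new(0)),
    };
    let head_addr = std::ptr::addr_of!(idx.head) as usize;
    let tail_addr = std::ptr::addr_of!(idx.tail) as usize;
    head_addr.abs_diff(tail_addr)
}
```

Without the wrapper, two adjacent `AtomicUsize` fields would sit 8 bytes apart on a 64-bit target and would almost always share a line. The `crossbeam-utils` crate ships a ready-made `CachePadded` type that could be used instead of a hand-rolled one.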