At Character.AI, our infrastructure handles thousands of GPUs, powering billions of active user seconds and supporting millions of users every month. This massive scale produces a staggering amount of log data, which is essential for monitoring the performance and reliability of our service.

From Fragmentation to Centralization

Initially, our