When you run machine learning models in production, performance isn’t just about how fast your GPU can crunch numbers; it’s also about how quickly you can get the model onto the GPU. Every second spent waiting on a checkpoint to load is a second your GPUs