Loading petascale datasets, which are crucial for training both vision and language models, is a significant challenge in deep learning. These datasets are often hosted on a variety of cloud backends, which adds complexity to the training process. Existing cloud storage solutions are increasingly seen as too expensive, too slow, or both for petascale training.