I-JEPA methodoly teaches a vision transformer model to predict parts of an image in the latent space rather than the pixel space.| DebuggerCafe