Autoregressive transformers are spectacular models for short sequences but scale poorly to long sequences such as high-resolution images, podcasts, code, or books. We proposed Megabyte, a multi-scale decoder architecture that enables end-to-end differentiable modeling of sequences of over one million bytes. Megabyte segments sequences into patches and uses a local submodel within patches and a global model between patches. This enables sub-quadratic self-attention, much larger feedforward layers for the same compute, and improved parallelism during decoding, unlocking better performance at reduced cost for both training and generation.
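
To make the patch-based global/local decomposition concrete, here is a minimal PyTorch sketch of the idea. The names and sizes (`ToyMegabyte`, a patch size of 8, the layer counts) are illustrative assumptions, not the released implementation, and the causal masks and shifted inputs needed for true autoregressive training are omitted for brevity.

```python
# Sketch of a Megabyte-style decomposition: bytes are grouped into fixed-size
# patches; a global transformer attends across patch embeddings, and a small
# local transformer models bytes within each patch.
import torch
import torch.nn as nn

class ToyMegabyte(nn.Module):
    def __init__(self, vocab=256, patch=8, d_local=128, d_global=512, n_heads=8):
        super().__init__()
        self.patch = patch
        self.byte_emb = nn.Embedding(vocab, d_local)
        # Patch embedding: concatenate the byte embeddings inside each patch.
        self.to_global = nn.Linear(patch * d_local, d_global)
        g_layer = nn.TransformerEncoderLayer(d_global, n_heads, batch_first=True)
        self.global_model = nn.TransformerEncoder(g_layer, num_layers=4)
        # Project global context back to per-byte conditioning for the local model.
        self.to_local = nn.Linear(d_global, patch * d_local)
        l_layer = nn.TransformerEncoderLayer(d_local, 4, batch_first=True)
        self.local_model = nn.TransformerEncoder(l_layer, num_layers=2)
        self.head = nn.Linear(d_local, vocab)

    def forward(self, bytes_in):                 # bytes_in: (B, T), T divisible by patch
        B, T = bytes_in.shape
        K = T // self.patch                      # number of patches
        x = self.byte_emb(bytes_in)              # (B, T, d_local)
        patches = x.view(B, K, -1)               # (B, K, patch * d_local)
        g = self.global_model(self.to_global(patches))        # attention across patches only
        ctx = self.to_local(g).view(B * K, self.patch, -1)    # per-byte global context
        # Local model attends only within each patch, conditioned on global context.
        h = self.local_model(x.view(B * K, self.patch, -1) + ctx)
        return self.head(h).view(B, T, -1)       # per-byte logits

model = ToyMegabyte()
logits = model(torch.randint(0, 256, (2, 64)))   # -> (2, 64, 256)
```

Because attention runs over K patches in the global model and over only `patch` bytes in the local model, the self-attention cost grows sub-quadratically in the full sequence length, which is where the efficiency gains described above come from.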