Tatsuya Hiraoka, Sho Takase, Kei Uchiumi, Atsushi Keyaki, Naoaki Okazaki. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. 2021. | ACL Anthology
Highlights the desire to replace tokenization with a general method that better leverages compute and data. We'll see tokenization's fragility and review the Byte Latent Transformer architecture. | ⛰️ lucalp