Introduction

One of the key components of the Transformer architecture is the Attention layer, which lets every word (or, more generally, every token) learn from the context provided by every other token in the sequence. It was introduced in the seminal paper "Attention Is All You Need". In