Attention powers “transformers” - the seemingly complex architecture behind large language models (LLMs) like ChatGPT. But what does attention even mean?