This post explains the basics of LLM inference, focusing mainly on how it differs from LLM training.

## Autoregressive Text Generation

Unlike training, where all tokens in a sequence are processed in parallel, inference generates tokens one by one. Therefore, producing a full sentence requires several forward passes, one per generated token. The following video from HuggingFace illustrates how it works.

*Autoregressive token generation. Source: HuggingFace*

Before generating the first token, the LLM first puts all in...
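To make the per-token loop concrete, here is a minimal sketch of greedy autoregressive decoding using the Hugging Face `transformers` API. The model name (`"gpt2"`), the prompt, and the number of new tokens are placeholder assumptions, and greedy argmax is used instead of sampling just to keep the example short.

```python
# Minimal sketch: greedy autoregressive decoding, one forward pass per token.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

input_ids = tokenizer("The capital of France is", return_tensors="pt").input_ids

max_new_tokens = 10
with torch.no_grad():
    for _ in range(max_new_tokens):
        # One full forward pass over the current sequence.
        logits = model(input_ids).logits
        # Greedily pick the most likely next token from the last position.
        next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        # Append it and repeat: the sequence grows by one token per step.
        input_ids = torch.cat([input_ids, next_token], dim=-1)

print(tokenizer.decode(input_ids[0]))
```

Note that this naive loop re-processes the entire sequence on every step; in practice a KV cache is used to avoid that redundant work.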