Topic: [2401.10774] Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads