Topic: LLMs and reinforcement learning