Topic: RAG with GRPO Fine-Tuned Reasoning Model