Trust Region Policy Optimization (TRPO) is a Policy Gradient method that addresses many of the issues of Vanilla Policy Gradients (VPG). Despite not being state-of-the-art currently, it paved the path for more robust algorithms like Proximal Policy Optimization (PPO).