From: Sam Altman
Reinforcement Learning Progress
https://blog.samaltman.com/reinforcement-learning-progress
Today, OpenAI released a new result. We used PPO (Proximal Policy Optimization), a general reinforcement learning algorithm invented by OpenAI, to train a team of five agents to play Dota 2 and beat semi-pro players.