Topic: MIT introduces new RL technique that moves beyond binary rewards