UC Berkeley Researchers Introduce Video Prediction Rewards (VIPER): An Algorithm That Leverages Pretrained Video Prediction Models As Action-Free Reward Signals For Reinforcement Learning
Designing a reward function by hand is time-consuming and can result in unintended consequences. It ...