Jan Peters
2 indexed papers
Research Timeline
The paper introduces frozen, generalist value functions as differentiable surrogates for efficiently optimizing and analyzing new multi-embodiment robot designs, without requiring repeated reinforcement learning training.
The paper proposes a coherent inverse reinforcement learning (IRL) method to improve large behavior models for robotic control, achieving superior sample efficiency and performance on complex sparse manipulation tasks compared to traditional RL baselines.
Papers
Coherent Off-Policy Improvement of Large Behavior Models with Learned Rewards
Christian Scherer, Joe Watson, Theo Gruner, Daniel Palenicek +2 more