Hao Bai
5 indexed papers
Research Timeline
The paper proposes a provably secure steganography scheme based on list decoding that significantly increases embedding capacity for Large Language Models (LLMs) compared to existing methods.
PRO-CUA introduces a process-reward optimization framework that enables efficient, step-level reinforcement learning for training computer use agents by decoupling environment interaction from policy optimization.
The paper introduces EASE, a method that enhances multimodal Reinforcement Learning with Verifiable Rewards (RLVR) by providing spatial attention supervision anchored to visual evidence, significantly improving visual grounding and reasoning capabilities in vision-language models (VLMs).
The paper introduces OpenWebRL, an open framework that enables training visual web agents using online multi-turn Reinforcement Learning directly on live websites, achieving state-of-the-art performance on challenging web benchmarks.
This paper studies how to scale robust robot policies by expanding physical domains in a recoverable way.
Papers
HORIZON: Recoverability-Governed Curriculum for Physical-Domain Scaling
Chenhao Bai, Liqin Lu, Kaijun Wang, Hui Chen +4 more
This paper studies how to scale robust robot policies by expanding physical domains in a recoverable way.