Hao Zheng
4 indexed papers
Research Timeline
The paper introduces RLSpoofer, a lightweight, black-box reinforcement learning attack that exposes the fragility of current LLM watermarking schemes by achieving a high spoofing success rate with minimal training data.
The paper introduces Evidence-Carrying Agents (ECA) to prevent multimodal agents from executing privileged actions based on unsupported or hallucinated perceptual claims, achieving near-zero unsafe execution rates.
The paper demonstrates that extrapolative weight averaging can effectively navigate and extend the correctness-efficiency frontier in code RL, leading to improved performance on complex programming tasks.
The paper introduces Dr-CiK, a new benchmark designed to evaluate agents' ability to proactively discover, filter, and utilize relevant external context for time series forecasting, demonstrating that current agents struggle significantly with this task.
Papers
Extrapolative Weight Averaging Reveals Correctness-Efficiency Frontiers in Code RL
Kunhao Zheng, Pierre Chambon, Juliette Decugis, Jonas Gehring +3 more