Juanzi Li
3 indexed papers
Research Timeline
LongTraceRL targets long-context reasoning by generating highly challenging training data and introducing a fine-grained rubric reward, improving evidence-grounded reasoning in LLMs.
This paper introduces CHERRL, a controllable hacking environment for rubric-based reinforcement learning to study and mitigate reward hacking.
This paper presents EurekAgent, an environment-engineered agent system for metric-driven autonomous scientific discovery.
Papers
EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery
Amy Xin, Jiening Siow, Junjie Wang, Zijun Yao +4 more