Juncheng Wu
2 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
This paper conducts the first real-world safety evaluation of the personal AI agent OpenClaw, demonstrating that its broad system access creates inherent vulnerabilities that significantly increase the attack success rate regardless of the underlying large language model.
The paper distinguishes between a model's ability to generate useful updates for external agent components (harness-updating) and its ability to benefit from those updates (harness-benefit), finding that updating capabilities are surprisingly uniform while benefit is maximized in mid-tier models.
Papers
Harness Updating Is Not Harness Benefit: Disentangling Evolution Capabilities in Self-Evolving LLM Agents
Minhua Lin, Juncheng Wu, Zijun Wang, Zhan Shi +13 more
The paper distinguishes between a model's ability to generate useful updates for external agent components (harness-updating) and its ability to benefit from those updates (harness-benefit), finding t…