Rui Zhou
3 indexed papers
Research Timeline
The paper analyzes protracted vulnerabilities (PCVEs) in open-source projects and proposes DeeptraVul, an enhanced detection approach that significantly improves vulnerability coverage by integrating multiple development artifacts and an LLM.
MACReD introduces a hierarchical multi-agent framework that achieves state-of-the-art performance in parsing complex chemical reaction diagrams by coordinating specialized agents for perception and global reasoning.
ESPO is a novel reinforcement learning algorithm that detects trajectory failure in large language models and terminates rollouts early, significantly improving performance on mathematical reasoning benchmarks while reducing computational cost.
Papers
ESPO: Early-Stopping Proximal Policy Optimization
Zihang Li, Rui Zhou, Yingcheng Shi, Wenhan Yu +7 more