Taiwei Shi
2 indexed papers
Research Timeline
The paper introduces OS-BLIND, a benchmark demonstrating that current safety evaluations fail to detect critical vulnerabilities in computer-use agents when user instructions are benign, showing high attack success rates even for safety-aligned models.
The paper proposes ReuseRL, a method that improves agent generalization in reinforcement learning by requiring that successful agent trajectories be structurally compressible into reusable skills.
Papers
Skill Reuse as Compression in Agentic RL
Zhikun Xu, Yu Feng, Jacob Dineen, Taiwei Shi +2 more
The paper proposes ReuseRL, a method that improves agent generalization in reinforcement learning by requiring that successful agent trajectories be structurally compressible into reusable skills.