Zhaojiacheng Zhou
2 indexed papers
Research Timeline
The paper introduces Proteus, a self-evolving red-team framework that measures the adaptive leakage risk of LLM agent skills, demonstrating that current vetting methods significantly underestimate residual risk against iterative attackers.
The paper introduces Obsessive Experience Poisoning (OEP), a low-privilege black-box attack that poisons self-evolving LLM agents by generating locally correct but harmful experiences, causing dangerous over-generalization during reflection.
Papers
OEP: Poisoning Self-Evolving LLM Agents via Locally Correct but Non-Transferable Experiences