Wen Shen
5 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
MemMorph introduces a novel memory poisoning attack that biases LLM agent tool selection by injecting crafted records into the agent's long-term memory, achieving high success rates even against modern defenses.
The paper introduces AgentDoG 1.5, a lightweight and scalable alignment framework that significantly improves AI agent safety and security for complex, open-world agentic scenarios.
The paper introduces PhoneWorld, a scalable pipeline that automatically converts real-world GUI trajectories and screenshots into controllable, reproducible phone-use environments, significantly improving agent performance across multiple mobile benchmarks.
The paper introduces AgentDoG 1.5, a lightweight and scalable alignment framework that significantly improves AI agent safety and security for complex open-world agent deployments.
The paper introduces HLL, a benchmark that tests if multimodal agents can successfully substitute for human verification (like CAPTCHA) in complex, real-world workflows, finding that current agents are still brittle and fail under realistic conditions.
Papers
HLL: Can Agents Cross Humanity's Last Line of Verification?
Xinhao Song, Su Su, Sirui Song, Hongliang Wu +5 more
The paper introduces HLL, a benchmark that tests if multimodal agents can successfully substitute for human verification (like CAPTCHA) in complex, real-world workflows, finding that current agents ar…