~ similar to 2605.28646v2· 20 results
Shidong Pan, Xiaoyu Sun, Tianyi Zhang, Dianshu Liao +2 more
SkillGuard introduces a novel, skill-centric permission framework to secure LLM agent skill ecosystems by jointly regulating both context influence and runtime action side effects.
The paper introduces AgentSecBench, a security evaluation framework that measures prompt injection, privacy leakage, and tool-use integrity in LLM agents by defining formal security games and testing…
Zhichao Liu, Wenbo Pan, Haining Yu, Ge Gao +2 more
WebTrap introduces a stealthy, mid-task hijacking attack that successfully compromises browser agents during long-horizon tasks by seamlessly fusing malicious instructions with the original user goal.
This survey analyzes the unique security threats posed by complex, multi-agent AI systems and proposes Confidential Computing (CC) using Trusted Execution Environments (TEEs) as a hardware-rooted defe…
Ismail Hossain, Sai Puppala, Zhuoran Lu, Sajedul Talukder +1 more
The paper introduces SkillVetBench, a novel two-stage benchmark that effectively detects and verifies malicious behavior in open agentic skill ecosystems, significantly outperforming existing static a…
Ismail Hossain, Sai Puppala, Zhuoran Lu, Sajedul Talukder +1 more
The paper introduces SkillVetBench, a novel two-stage benchmark that effectively detects and verifies malicious behavior hidden within open agentic skills, significantly outperforming static and seman…
Robert Stanley, Avi Verma, Lillian Tsai, Konstantinos Kallas +1 more
The paper introduces GAAP, an execution environment that deterministically guarantees the confidentiality of private user data by enforcing user-defined permission specifications on AI agents, even ag…
This paper analyzes 470 security advisories in the OpenClaw AI agent framework, demonstrating that the system's structural weakness lies in per-layer trust enforcement, enabling cross-layer remote cod…
Guangsheng Yu, Qin Wang, Rui Lang, Shuai Su +1 more
PlanTwin introduces a privacy-preserving architecture that allows cloud-hosted LLMs to plan over sensitive local environments by projecting the raw state into a sanitized, abstract digital twin.
Fazhong Liu, Zhuoyan Chen, Tu Lan, Haozhen Tan +5 more
This paper identifies and characterizes 'guidance injection,' a stealthy attack vector that embeds adversarial operational narratives into autonomous coding agents' bootstrap guidance, demonstrating h…
Yunhao Feng, Yifan Ding, Yingshui Tan, Boren Zheng +5 more
SkillTrojan introduces a novel backdoor attack targeting the composition of reusable skills in agent systems, demonstrating high attack success rates with minimal impact on normal system functionality…
Chang Jin, An Wang, Zeming Wei, Kai Wang +6 more
The paper introduces SkillSafetyBench, a comprehensive benchmark demonstrating that agent safety failures often stem from adversarial influences within reusable skills and execution environments, rath…
Haomin Zhuang, Hanwen Xing, Yujun Zhou, Yuchen Ma +4 more
The paper introduces AgentTrap, a dynamic benchmark that measures LLM agent susceptibility to malicious side effects embedded within seemingly benign third-party skills, finding that agents often exec…
Jiangrong Wu, Yuhong Nan, Yixi Lin, Huaijin Wang +3 more
SkillScope introduces a graph-based framework to enforce fine-grained least-privilege in LLM Agent Skills, significantly reducing over-privileged actions while maintaining task functionality.
This paper demonstrates a novel, multi-stage privacy-leakage attack chain against black-box chatbot agents by combining indirect prompt injection with web-tool invocation, showing that such attacks ar…
Zenghao Duan, Yuxin Tian, Zhiyi Yin, Liang Pang +5 more
SkillAttack is a red-teaming framework that dynamically tests the exploitability of latent vulnerabilities in LLM agent skills using adversarial prompting, demonstrating that even benign skills pose s…
The paper addresses the privacy leak of speculative tool calls by proposing Speculative Tool Privacy Contracts, a runtime abstraction that ensures observation before commitment does not disclose user…
The paper addresses the privacy leak caused by speculative tool calls in language agents by proposing Speculative Tool Privacy Contracts, a runtime mechanism that restricts information leakage before…
The paper proposes the Layered Attack Surface Model (LASM), a structural taxonomy that maps security threats and defenses across the complex, multi-layered architecture of AI agents, revealing signifi…
Zihan Wang, Rui Zhang, Yu Liu, Chi Liu +3 more
This paper presents the first systematic study of black-box skill stealing attacks against proprietary LLM agents, demonstrating that structured agent skills can be easily extracted, posing a signific…