~ similar to 2604.21829v2· 20 results
Yunhao Feng, Yifan Ding, Yingshui Tan, Boren Zheng +5 more
SkillTrojan introduces a novel backdoor attack targeting the composition of reusable skills in agent systems, demonstrating high attack success rates with minimal impact on normal system functionality…
Zenghao Duan, Yuxin Tian, Zhiyi Yin, Liang Pang +5 more
SkillAttack is a red-teaming framework that dynamically tests the exploitability of latent vulnerabilities in LLM agent skills using adversarial prompting, demonstrating that even benign skills pose s…
Zhiyuan Li, Jingzheng Wu, Xiang Ling, Xing Cui +1 more
This paper provides the first comprehensive security analysis of the Agent Skills framework, identifying severe structural vulnerabilities that require fundamental architectural changes rather than si…
Zhihao Chen, Ying Zhang, Yi Liu, Gelei Deng +6 more
This study conducts a large-scale empirical analysis of third-party LLM agent skills, identifying that credential leakage is a pervasive, cross-modal issue primarily caused by debug logging and result…
Shidong Pan, Xiaoyu Sun, Tianyi Zhang, Dianshu Liao +2 more
SkillGuard introduces a novel, skill-centric permission framework to secure LLM agent skill ecosystems by jointly regulating both context influence and runtime action side effects.
Yubin Qu, Yi Liu, Tongcheng Geng, Gelei Deng +4 more
The paper introduces Document-Driven Implicit Payload Execution (DDIPE) to demonstrate that malicious code can be embedded in LLM agent skill documentation, allowing supply-chain attacks to hijack age…
Jiejun Tan, Zhicheng Dou, Xinyu Yang, Yuyang Hu +3 more
This paper introduces ClawTrojan, a benchmark for multi-step trojan attacks against LLM agents, and proposes DASGuard, a dynamic defense mechanism that traces and sanitizes untrusted control content i…
Jiejun Tan, Zhicheng Dou, Xinyu Yang, Yuyang Hu +3 more
The paper introduces ClawTrojan, a benchmark for multi-step trojan attacks against LLM agents, and proposes DASGuard, a defense mechanism that detects and sanitizes backdoor content planted across mul…
The paper introduces AgentSecBench, a security evaluation framework that measures prompt injection, privacy leakage, and tool-use integrity in LLM agents by defining formal security games and testing…
The paper introduces BadSkill, a novel backdoor attack formulation that targets third-party agent skills by poisoning the embedded model artifacts, achieving high attack success rates across various m…
This paper demonstrates that the natural language metadata (SKILL.md) used to describe AI agent skills introduces significant semantic supply-chain risks, allowing attackers to manipulate discovery, s…
This paper introduces and evaluates guardian-based defenses, showing that an intermediary LLM agent can significantly reduce the success rate of skill injection attacks on terminal-based agents, even…
This paper proposes and evaluates guardian-based defenses, both dynamic and static, to mitigate skill injection attacks targeting LLM agents that rely on reusable procedural skills.
Ismail Hossain, Sai Puppala, Zhuoran Lu, Sajedul Talukder +1 more
The paper introduces SkillVetBench, a novel two-stage benchmark that effectively detects and verifies malicious behavior in open agentic skill ecosystems, significantly outperforming existing static a…
Ismail Hossain, Sai Puppala, Zhuoran Lu, Sajedul Talukder +1 more
The paper introduces SkillVetBench, a novel two-stage benchmark that effectively detects and verifies malicious behavior hidden within open agentic skills, significantly outperforming static and seman…
Shihao Weng, Yang Feng, Jinrui Zhang, Xiaofei Xie +2 more
The paper introduces ARGUS, a defense mechanism that uses provenance-aware decision auditing to protect LLM agents from sophisticated, context-aware prompt injection attacks, significantly reducing th…
Shenao Wang, Junjie He, Yanjie Zhao, Yayi Wang +2 more
The paper introduces MalSkills, a neuro-symbolic framework that detects malicious skills in the expanding agentic supply chain by analyzing security-sensitive operations across heterogeneous artifacts…
Yiyong Liu, Chia-Yi Hsu, Chun-Ying Huang, Michael Backes +2 more
This paper introduces Dependency Steering, a novel attack paradigm demonstrating that malicious agent skills can actively bias LLM coding agents to use attacker-controlled packages, posing a significa…
Youness Bouchari, Matteo Boffa, Marco Mellia, Idilio Drago +2 more
The paper re-evaluates LLM agents on CTFs, finding that while general-purpose agents like claude-code are strong baselines, specialized, modular architectures significantly improve performance and con…