Papers similar to 2605.05868v1

~ similar to 2605.05868v1· 20 results

cs.CRcs.SERecentJun 2, 2026

SkillGuard: A Permission Framework for Agent Skills

Shidong Pan, Xiaoyu Sun, Tianyi Zhang, Dianshu Liao +2 more

SkillGuard introduces a novel, skill-centric permission framework to secure LLM agent skill ecosystems by jointly regulating both context influence and runtime action side effects.

View →

cs.CRRecentApr 23, 2026

Black-Box Skill Stealing Attack from Proprietary LLM Agents: An Empirical Study

Zihan Wang, Rui Zhang, Yu Liu, Chi Liu +3 more

This paper presents the first systematic study of black-box skill stealing attacks against proprietary LLM agents, demonstrating that structured agent skills can be easily extracted, posing a signific…

View →

cs.CRcs.AIcs.CLRecentMay 12, 2026

SkillSafetyBench: Evaluating Agent Safety under Skill-Facing Attack Surfaces

Chang Jin, An Wang, Zeming Wei, Kai Wang +6 more

The paper introduces SkillSafetyBench, a comprehensive benchmark demonstrating that agent safety failures often stem from adversarial influences within reusable skills and execution environments, rath…

View →

cs.CRRecentMay 13, 2026

Do Skill Descriptions Tell the Truth? Detecting Undisclosed Security Behaviors in Code-Backed LLM Skills

Wenhui He, Yue Li, Bang Fu, Huan Xing +3 more

The paper introduces SKILLSCOPE, a system that detects security-relevant behaviors in code-backed LLM skills that are not disclosed in the natural language description, finding that 9.4% of skills exh…

View →

cs.CRcs.AIRecentApr 8, 2026

SkillTrojan: Backdoor Attacks on Skill-Based Agent Systems

Yunhao Feng, Yifan Ding, Yingshui Tan, Boren Zheng +5 more

SkillTrojan introduces a novel backdoor attack targeting the composition of reusable skills in agent systems, demonstrating high attack success rates with minimal impact on normal system functionality…

View →

cs.CRcs.AIRecentMar 30, 2026

Evaluating Privilege Usage of Agents with Real-World Tools

Quan Zhang, Lianhang Fu, Lvsi Lian, Gwihwan Go +4 more

The paper introduces GrantBox, a new security sandbox that evaluates how well LLM agents handle real-world tool privileges, finding that agents remain highly vulnerable to sophisticated attacks.

View →

cs.CRcs.AIcs.MARecentMay 1, 2026

Skills as Verifiable Artifacts: A Trust Schema and a Biconditional Correctness Criterion for Human-in-the-Loop Agent Runtimes

Alfredo Metere

The paper proposes a trust schema and verification framework to ensure that agent skills, which augment LLMs, are rigorously verified before deployment, thereby making human-in-the-loop oversight scal…

View →

cs.CRcs.AIRecentApr 28, 2026

Structured Security Auditing and Robustness Enhancement for Untrusted Agent Skills

Lijia Lv, Xuehai Tang, Jie Wen, Jizhong Han +1 more

The paper introduces SkillGuard-Robust, a novel framework for robust, cross-file security auditing of untrusted agent skills, achieving high accuracy on large-scale package evaluations.

View →

cs.CRcs.AIcs.CLRecentApr 3, 2026

Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems

Yubin Qu, Yi Liu, Tongcheng Geng, Gelei Deng +4 more

The paper introduces Document-Driven Implicit Payload Execution (DDIPE) to demonstrate that malicious code can be embedded in LLM agent skill documentation, allowing supply-chain attacks to hijack age…

View →

cs.CRRecentApr 5, 2026

SkillAttack: Automated Red Teaming of Agent Skills through Attack Path Refinement

Zenghao Duan, Yuxin Tian, Zhiyi Yin, Liang Pang +5 more

SkillAttack is a red-teaming framework that dynamically tests the exploitability of latent vulnerabilities in LLM agent skills using adversarial prompting, demonstrating that even benign skills pose s…

View →

cs.CRcs.AIRecentMay 13, 2026

No Attack Required: Semantic Fuzzing for Specification Violations in Agent Skills

Ying Li, Hongbo Wen, Yanju Chen, Hanzhi Liu +2 more

The paper introduces Sefz, a semantic fuzzing framework that automatically discovers specification violations in LLM agent skills, finding a significant number of previously unknown exploitable guardr…

View →

cs.CRcs.AIeess.SYRecentMay 12, 2026

Behavioral Integrity Verification for AI Agent Skills

Yuhao Wu, Tung-Ling Li, Hongliang Liu

The paper introduces Behavioral Integrity Verification (BIV), a framework that systematically audits AI agent skills by comparing their declared capabilities against their actual implementation, revea…

View →

cs.CRcs.AIRecentApr 3, 2026

Credential Leakage in LLM Agent Skills: A Large-Scale Empirical Study

Zhihao Chen, Ying Zhang, Yi Liu, Gelei Deng +6 more

This study conducts a large-scale empirical analysis of third-party LLM agent skills, identifying that credential leakage is a pervasive, cross-modal issue primarily caused by debug logging and result…

View →

cs.CRcs.AIRecentApr 3, 2026

Towards Secure Agent Skills: Architecture, Threat Taxonomy, and Security Analysis

Zhiyuan Li, Jingzheng Wu, Xiang Ling, Xing Cui +1 more

This paper provides the first comprehensive security analysis of the Agent Skills framework, identifying severe structural vulnerabilities that require fundamental architectural changes rather than si…

View →

cs.SEcs.AIcs.CRRecentMay 30, 2026

When Safe Skills Collide: Measuring Compositional Risk in Agent Skill Ecosystems

Su Wang, Pin Qian, Yihang Chen, Junxian You +5 more

The paper introduces SkillReact, a framework that measures compositional risk in agent skill ecosystems, finding that even if individual skills are safe, their combination can create significant, unad…

View →

cs.SEcs.AIcs.CRRecentMay 30, 2026

When Safe Skills Collide: Measuring Compositional Risk in Agent Skill Ecosystems

Su Wang, Pin Qian, Yihang Chen, Junxian You +5 more

View →

cs.CRcs.AIRecentMay 30, 2026

Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems

Ismail Hossain, Sai Puppala, Zhuoran Lu, Sajedul Talukder +1 more

The paper introduces SkillVetBench, a novel two-stage benchmark that effectively detects and verifies malicious behavior in open agentic skill ecosystems, significantly outperforming existing static a…

View →

cs.CRcs.AIRecentMay 30, 2026

Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems

Ismail Hossain, Sai Puppala, Zhuoran Lu, Sajedul Talukder +1 more

The paper introduces SkillVetBench, a novel two-stage benchmark that effectively detects and verifies malicious behavior hidden within open agentic skills, significantly outperforming static and seman…

View →

cs.CRcs.AIRecentApr 24, 2026

RouteGuard: Internal-Signal Detection of Skill Poisoning in LLM Agents

Wenjie Xiao, Xuehai Tang, Biyu Zhou, Songlin Hu +1 more

RouteGuard is a novel detector that identifies skill poisoning in LLM agents by monitoring structured internal attention shifts, achieving high detection rates on critical skill-injection attacks.

View →

cs.CRcs.AIRecentMar 17, 2026

Context Matters: Repository-Aware Security Analysis of the Agent Skill Ecosystem

Florian Holzbauer, David Schmidt, Gabriel Gegenhuber, Sebastian Schrittwieser +1 more

This paper conducts a large-scale, repository-aware security analysis of AI agent skills, demonstrating that incorporating surrounding project context drastically reduces the rate of false positive ma…

View →