Papers similar to 2606.00448v1

~ similar to 2606.00448v1· 20 results

cs.SEcs.AIcs.CRRecentMay 30, 2026

When Safe Skills Collide: Measuring Compositional Risk in Agent Skill Ecosystems

Su Wang, Pin Qian, Yihang Chen, Junxian You +5 more

The paper introduces SkillReact, a framework that measures compositional risk in agent skill ecosystems, finding that even if individual skills are safe, their combination can create significant, expl…

View →

cs.CRcs.AIcs.CLRecentMay 12, 2026

SkillSafetyBench: Evaluating Agent Safety under Skill-Facing Attack Surfaces

Chang Jin, An Wang, Zeming Wei, Kai Wang +6 more

The paper introduces SkillSafetyBench, a comprehensive benchmark demonstrating that agent safety failures often stem from adversarial influences within reusable skills and execution environments, rath…

View →

cs.CLRecentJun 1, 2026

SkillHarm: Lifecycle-Aware Skill-Based Attacks via Automated Construction

Yuting Ning, Zhehao Zhang, Yash Kumar Lal, Boyu Gou +7 more

The paper introduces SkillHarm, a comprehensive benchmark and automated framework for evaluating skill-based attacks across the entire agent skill-use lifecycle, demonstrating that current agents rema…

View →

cs.CRcs.AIRecentMay 30, 2026

Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems

Ismail Hossain, Sai Puppala, Zhuoran Lu, Sajedul Talukder +1 more

The paper introduces SkillVetBench, a novel two-stage benchmark that effectively detects and verifies malicious behavior in open agentic skill ecosystems, significantly outperforming existing static a…

View →

cs.CRcs.AIRecentMay 30, 2026

Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems

Ismail Hossain, Sai Puppala, Zhuoran Lu, Sajedul Talukder +1 more

The paper introduces SkillVetBench, a novel two-stage benchmark that effectively detects and verifies malicious behavior hidden within open agentic skills, significantly outperforming static and seman…

View →

cs.CRRecentApr 5, 2026

SkillAttack: Automated Red Teaming of Agent Skills through Attack Path Refinement

Zenghao Duan, Yuxin Tian, Zhiyi Yin, Liang Pang +5 more

SkillAttack is a red-teaming framework that dynamically tests the exploitability of latent vulnerabilities in LLM agent skills using adversarial prompting, demonstrating that even benign skills pose s…

View →

cs.CRcs.AIRecentApr 8, 2026

SkillTrojan: Backdoor Attacks on Skill-Based Agent Systems

Yunhao Feng, Yifan Ding, Yingshui Tan, Boren Zheng +5 more

SkillTrojan introduces a novel backdoor attack targeting the composition of reusable skills in agent systems, demonstrating high attack success rates with minimal impact on normal system functionality…

View →

cs.CRRecentMar 28, 2026

SafeClaw-R: Towards Safe and Secure Multi-Agent Personal Assistants

Haoyu Wang, Zibo Xiao, Yedi Zhang, Christopher M. Poskitt +1 more

The paper proposes SafeClaw-R, a novel framework that enforces safety as a system-level invariant over the execution graph to mitigate the high safety and security risks inherent in autonomous multi-a…

View →

cs.CRcs.SERecentMar 22, 2026

SkillProbe: Security Auditing for Emerging Agent Skill Marketplaces via Multi-Agent Collaboration

Zihan Guo, Zhiyu Chen, Xiaohang Nie, Jianghao Lin +2 more

The paper proposes SkillProbe, a multi-agent security auditing framework, demonstrating that high-popularity skills in LLM agent marketplaces are often insecure due to systemic combinatorial risks.

View →

cs.CRcs.AIRecentMay 12, 2026

Proteus: A Self-Evolving Red Team for Agent Skill Ecosystems

Zhaojiacheng Zhou

The paper introduces Proteus, a self-evolving red-team framework that measures the adaptive leakage risk of LLM agent skills, demonstrating that current vetting methods significantly underestimate res…

View →

cs.CRcs.AIRecentMay 13, 2026

AgentTrap: Measuring Runtime Trust Failures in Third-Party Agent Skills

Haomin Zhuang, Hanwen Xing, Yujun Zhou, Yuchen Ma +4 more

The paper introduces AgentTrap, a dynamic benchmark that measures LLM agent susceptibility to malicious side effects embedded within seemingly benign third-party skills, finding that agents often exec…

View →

cs.CRcs.AIRecentApr 16, 2026

HarmfulSkillBench: How Do Harmful Skills Weaponize Your Agents?

Yukun Jiang, Yage Zhang, Michael Backes, Xinyue Shen +1 more

This paper presents HarmfulSkillBench, a large-scale benchmark demonstrating that even small percentages of publicly available skills can be misused for harmful actions, significantly lowering LLM ref…

View →

cs.CRcs.AIRecentApr 28, 2026

Structured Security Auditing and Robustness Enhancement for Untrusted Agent Skills

Lijia Lv, Xuehai Tang, Jie Wen, Jizhong Han +1 more

The paper introduces SkillGuard-Robust, a novel framework for robust, cross-file security auditing of untrusted agent skills, achieving high accuracy on large-scale package evaluations.

View →

cs.CRcs.AIRecentMar 17, 2026

Context Matters: Repository-Aware Security Analysis of the Agent Skill Ecosystem

Florian Holzbauer, David Schmidt, Gabriel Gegenhuber, Sebastian Schrittwieser +1 more

This paper conducts a large-scale, repository-aware security analysis of AI agent skills, demonstrating that incorporating surrounding project context drastically reduces the rate of false positive ma…

View →

cs.CRcs.AIRecentMay 13, 2026

No Attack Required: Semantic Fuzzing for Specification Violations in Agent Skills

Ying Li, Hongbo Wen, Yanju Chen, Hanzhi Liu +2 more

The paper introduces Sefz, a semantic fuzzing framework that automatically discovers specification violations in LLM agent skills, finding a significant number of previously unknown exploitable guardr…

View →

cs.CRcs.AIRecentMar 28, 2026

SafetyDrift: Predicting When AI Agents Cross the Line Before They Actually Do

Aditya Dhodapkar, Farhaan Pishori

The paper introduces SafetyDrift, a predictive model that forecasts when AI agents will violate safety protocols by analyzing the cumulative risk across sequences of individually safe actions.

View →

cs.CRcs.AIRecentApr 3, 2026

Towards Secure Agent Skills: Architecture, Threat Taxonomy, and Security Analysis

Zhiyuan Li, Jingzheng Wu, Xiang Ling, Xing Cui +1 more

This paper provides the first comprehensive security analysis of the Agent Skills framework, identifying severe structural vulnerabilities that require fundamental architectural changes rather than si…

View →

cs.AIcs.CRRecentMay 12, 2026

Under the Hood of SKILL.md: Semantic Supply-chain Attacks on AI Agent Skill Registry

Shoumik Saha, Kazem Faghih, Soheil Feizi

This paper demonstrates that the natural language metadata (SKILL.md) used to describe AI agent skills introduces significant semantic supply-chain risks, allowing attackers to manipulate discovery, s…

View →

cs.CRcs.AIRecentMay 26, 2026

ChainCaps: Composition-Safe Tool-Using Agents via Monotonic Capability Attenuation

Xiaochong Jiang, Shiqi Yang, Ziwei Li, Lifei Liu +2 more

ChainCaps introduces a novel runtime capability budgeting system that prevents 'permission laundering' in complex tool-using agents, significantly reducing attack success rates while maintaining benig…

View →

cs.CRcs.AIcs.MARecentMay 1, 2026

Skills as Verifiable Artifacts: A Trust Schema and a Biconditional Correctness Criterion for Human-in-the-Loop Agent Runtimes

Alfredo Metere

The paper proposes a trust schema and verification framework to ensure that agent skills, which augment LLMs, are rigorously verified before deployment, thereby making human-in-the-loop oversight scal…

View →