Papers similar to 2604.02837v1

~ similar to 2604.02837v1· 20 results

cs.CRcs.LGRecentApr 25, 2026

A Systematic Survey of Security Threats and Defenses in LLM-Based AI Agents: A Layered Attack Surface Framework

The paper proposes the Layered Attack Surface Model (LASM), a structural taxonomy that maps security threats and defenses across the complex, multi-layered architecture of AI agents, revealing signifi…

View →

cs.CRcs.AIRecentMar 17, 2026

Context Matters: Repository-Aware Security Analysis of the Agent Skill Ecosystem

Florian Holzbauer, David Schmidt, Gabriel Gegenhuber, Sebastian Schrittwieser +1 more

This paper conducts a large-scale, repository-aware security analysis of AI agent skills, demonstrating that incorporating surrounding project context drastically reduces the rate of false positive ma…

View →

cs.CRcs.AIcs.CLRecentMay 12, 2026

SkillSafetyBench: Evaluating Agent Safety under Skill-Facing Attack Surfaces

Chang Jin, An Wang, Zeming Wei, Kai Wang +6 more

The paper introduces SkillSafetyBench, a comprehensive benchmark demonstrating that agent safety failures often stem from adversarial influences within reusable skills and execution environments, rath…

View →

cs.CRcs.SERecentMar 22, 2026

SkillProbe: Security Auditing for Emerging Agent Skill Marketplaces via Multi-Agent Collaboration

Zihan Guo, Zhiyu Chen, Xiaohang Nie, Jianghao Lin +2 more

The paper proposes SkillProbe, a multi-agent security auditing framework, demonstrating that high-popularity skills in LLM agent marketplaces are often insecure due to systemic combinatorial risks.

View →

cs.AIcs.CRRecentMay 12, 2026

Under the Hood of SKILL.md: Semantic Supply-chain Attacks on AI Agent Skill Registry

Shoumik Saha, Kazem Faghih, Soheil Feizi

This paper demonstrates that the natural language metadata (SKILL.md) used to describe AI agent skills introduces significant semantic supply-chain risks, allowing attackers to manipulate discovery, s…

View →

cs.CRcs.AIRecentMar 29, 2026

A Security Analysis of the OpenClaw AI Agent Framework

Surada Suwansathit, Yuxuan Zhang, Guofei Gu

This paper analyzes 470 security advisories in the OpenClaw AI agent framework, demonstrating that the system's structural weakness lies in per-layer trust enforcement, enabling cross-layer remote cod…

View →

cs.CRcs.AIRecentApr 30, 2026

Security Attack and Defense Strategies for Autonomous Agent Frameworks: A Layered Review with OpenClaw as a Case Study

Luyao Xu, Xiang Chen

This paper provides a systematic, layered review of security risks and defense strategies for autonomous agent frameworks, using OpenClaw as a case study to address the current lack of integrated rese…

View →

cs.CRRecentApr 5, 2026

SkillAttack: Automated Red Teaming of Agent Skills through Attack Path Refinement

Zenghao Duan, Yuxin Tian, Zhiyi Yin, Liang Pang +5 more

SkillAttack is a red-teaming framework that dynamically tests the exploitability of latent vulnerabilities in LLM agent skills using adversarial prompting, demonstrating that even benign skills pose s…

View →

cs.CRRecentMay 14, 2026

Toward Securing AI Agents Like Operating Systems

Lukas Pirch, Micha Horlboge, Patrick Großmann, Syeda Mahnur Asif +3 more

This paper analyzes the security of LLM-based autonomous agents by drawing parallels to operating system security, finding that while some vulnerabilities are inherent, many can be mitigated using est…

View →

cs.CLRecentJun 1, 2026

SkillHarm: Lifecycle-Aware Skill-Based Attacks via Automated Construction

Yuting Ning, Zhehao Zhang, Yash Kumar Lal, Boyu Gou +7 more

The paper introduces SkillHarm, a comprehensive benchmark and automated framework for evaluating skill-based attacks across the entire agent skill-use lifecycle, demonstrating that current agents rema…

View →

cs.CRcs.AIRecentMay 30, 2026

Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems

Ismail Hossain, Sai Puppala, Zhuoran Lu, Sajedul Talukder +1 more

The paper introduces SkillVetBench, a novel two-stage benchmark that effectively detects and verifies malicious behavior in open agentic skill ecosystems, significantly outperforming existing static a…

View →

cs.CRcs.AIRecentMay 30, 2026

Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems

Ismail Hossain, Sai Puppala, Zhuoran Lu, Sajedul Talukder +1 more

The paper introduces SkillVetBench, a novel two-stage benchmark that effectively detects and verifies malicious behavior hidden within open agentic skills, significantly outperforming static and seman…

View →

cs.CRcs.AIeess.SYRecentMay 12, 2026

Behavioral Integrity Verification for AI Agent Skills

Yuhao Wu, Tung-Ling Li, Hongliang Liu

The paper introduces Behavioral Integrity Verification (BIV), a framework that systematically audits AI agent skills by comparing their declared capabilities against their actual implementation, revea…

View →

cs.CRcs.AIRecentApr 28, 2026

Structured Security Auditing and Robustness Enhancement for Untrusted Agent Skills

Lijia Lv, Xuehai Tang, Jie Wen, Jizhong Han +1 more

The paper introduces SkillGuard-Robust, a novel framework for robust, cross-file security auditing of untrusted agent skills, achieving high accuracy on large-scale package evaluations.

View →

cs.CRcs.AIRecentMay 18, 2026

Agent Security is a Systems Problem

Mihai Christodorescu, Earlence Fernandes, Ashish Hooda, Somesh Jha +10 more

The paper argues that agent security must be treated as a systems problem, requiring the enforcement of security invariants at the system level rather than solely relying on improving the underlying A…

View →

cs.CRRecentApr 23, 2026

Black-Box Skill Stealing Attack from Proprietary LLM Agents: An Empirical Study

Zihan Wang, Rui Zhang, Yu Liu, Chi Liu +3 more

This paper presents the first systematic study of black-box skill stealing attacks against proprietary LLM agents, demonstrating that structured agent skills can be easily extracted, posing a signific…

View →

cs.CRcs.AIRecentMay 13, 2026

AgentTrap: Measuring Runtime Trust Failures in Third-Party Agent Skills

Haomin Zhuang, Hanwen Xing, Yujun Zhou, Yuchen Ma +4 more

The paper introduces AgentTrap, a dynamic benchmark that measures LLM agent susceptibility to malicious side effects embedded within seemingly benign third-party skills, finding that agents often exec…

View →

cs.CRcs.AIRecentApr 8, 2026

SkillTrojan: Backdoor Attacks on Skill-Based Agent Systems

Yunhao Feng, Yifan Ding, Yingshui Tan, Boren Zheng +5 more

SkillTrojan introduces a novel backdoor attack targeting the composition of reusable skills in agent systems, demonstrating high attack success rates with minimal impact on normal system functionality…

View →

cs.CRcs.AIRecentMay 26, 2026

Lessons from Penetration Tests on Large-Scale Agent Systems

Kevin Eykholt, Dhilung Kirat, Xiaokui Shu, Jiyong Jang +2 more

The paper reports on penetration tests conducted on proprietary, large-scale AI agent systems, finding that security vulnerabilities persist despite stricter development standards.

View →

cs.CRcs.AIRecentMay 27, 2026

Technical Report: Exploring the Emerging Threats of the Agent Skill Ecosystem

Luca Beurer-Kellner, Aleksei Kudrinskii, Marco Milanta, Kristian Bonde Nielsen +2 more

The paper analyzes a large corpus of AI agent skills, identifying a significant percentage of malicious payloads that pose serious security risks to users and systems.

View →