The paper introduces MalSkills, a neuro-symbolic framework that detects malicious skills in the expanding agentic supply chain by analyzing security-sensitive operations across heterogeneous artifacts.
Skills are increasingly used to extend LLM agents by packaging prompts, code, and configurations into reusable modules. As public registries and marketplaces expand, they form an emerging agentic supply chain, but also introduce a new attack surface for malicious skills. Detecting malicious skills is challenging because relevant evidence is often distributed across heterogeneous artifacts and must be reasoned about in context. Existing static, LLM-based, and dynamic approaches each capture only part of this problem, making them insufficient for robust real-world detection. In this paper, we present MalSkills, a neuro-symbolic framework for malicious skill detection. MalSkills first extracts security-sensitive operations from heterogeneous artifacts through a combination of symbolic parsing and LLM-assisted semantic analysis. It then constructs a skill dependency graph that links artifacts, operations, operands, and value flows across the skill. On top of this graph, MalSkills performs neuro-symbolic reasoning to infer malicious patterns or previously unseen suspicious workflows. We evaluate MalSkills on a benchmark of 200 real-world skills against 5 state-of-the-art baselines. MalSkills achieves 93% F1, outperforming the baselines by 5 to 87 percentage points. We further apply MalSkills to analyze 150,108 skills collected from 7 public registries, revealing 620 malicious skills. So far, we have reviewed 100 of them and identified 76 previously unknown malicious skills, all of which were responsibly reported and are currently awaiting confirmation from the platforms and maintainers. These results demonstrate the potential of MalSkills in securing the agentic supply chain.
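To make the idea of a skill dependency graph more concrete, the following is a minimal sketch in Python. It is not the MalSkills implementation: the node kinds, the edge labels (`invokes`, `contains`, `uses`, `flows_to`), the file names, and the single reachability check are all assumptions chosen for illustration. The sketch only shows how artifacts, operations, operands, and value flows might be linked so that a rule such as "a sensitive file read flows into a network send" can be checked across artifacts.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A graph node; `kind` is one of 'artifact', 'operation', 'operand' (assumed labels)."""
    kind: str
    name: str


class SkillGraph:
    """Toy dependency graph with labeled, directed edges (hypothetical structure)."""

    def __init__(self):
        self.edges = defaultdict(set)  # source node -> {(label, target node)}

    def add_edge(self, src, label, dst):
        self.edges[src].add((label, dst))

    def reaches(self, src, dst, label="flows_to"):
        """Breadth-first search that follows only edges carrying `label`."""
        seen = {src}
        queue = deque([src])
        while queue:
            cur = queue.popleft()
            if cur == dst:
                return True
            for lbl, nxt in self.edges.get(cur, ()):
                if lbl == label and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return False


# Hypothetical skill: a prompt file tells the agent to run a script that
# reads a private key and sends it to a remote endpoint.
skill_md = Node("artifact", "SKILL.md")
script = Node("artifact", "setup.sh")
read_key = Node("operation", "file_read")
send = Node("operation", "network_send")
key_path = Node("operand", "~/.ssh/id_rsa")
url = Node("operand", "https://example.invalid/upload")

g = SkillGraph()
g.add_edge(skill_md, "invokes", script)   # cross-artifact link
g.add_edge(script, "contains", read_key)  # operation found in the script
g.add_edge(script, "contains", send)
g.add_edge(read_key, "uses", key_path)    # operation -> operand
g.add_edge(send, "uses", url)
g.add_edge(read_key, "flows_to", send)    # value flow: file contents reach the request

# Assumed symbolic rule: a sensitive read whose value flows into a network send.
suspicious = g.reaches(read_key, send)
print("possible exfiltration:", suspicious)
```

A real system would need far richer operation extraction and would combine such symbolic checks with LLM-based judgment, as the abstract describes; the single rule here is only a stand-in for that reasoning step.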
SkillAttack: Automated Red Teaming of Agent Skills through Attack Path Refinement
SkillAttack is a red-teaming framework that dynamically tests the exploitability…
Context Matters: Repository-Aware Security Analysis of the Agent Skill Ecosystem
This paper conducts a large-scale, repository-aware security analysis of AI agen…
BadSkill: Backdoor Attacks on Agent Skills via Model-in-Skill Poisoning
The paper introduces BadSkill, a novel backdoor attack formulation that targets…
Towards Secure Agent Skills: Architecture, Threat Taxonomy, and Security Analysis
This paper provides the first comprehensive security analysis of the Agent Skill…
Credential Leakage in LLM Agent Skills: A Large-Scale Empirical Study
This study conducts a large-scale empirical analysis of third-party LLM agent sk…
Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems
The paper introduces Document-Driven Implicit Payload Execution (DDIPE) to demon…
SkillProbe: Security Auditing for Emerging Agent Skill Marketplaces via Multi-Agent Collaboration
The paper proposes SkillProbe, a multi-agent security auditing framework, demons…
SkillTester: Benchmarking Utility and Security of Agent Skills
SkillTester is a comprehensive tool and framework designed to benchmark both the…