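The paper analyzes a dataset of 67,453 public agent skills and shows that three security scanners (VirusTotal, static heuristic analysis, and NVIDIA SkillSpector) rarely flag the same skills. The authors argue that agent-skill security therefore calls for layered governance rather than single-scanner allow/block decisions.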
Agent skills extend AI agents with reusable instructions, tools, scripts, references, and workflows, establishing a security boundary distinct from both model safety and traditional package-malware detection. ClawHub Security Signals is a sanitized dataset of 67,453 latest public OpenClaw skill versions. Each row pairs redacted SKILL.md content and sanitized bundled files where present with a final ClawScan registry verdict and evidence from three scanner families: VirusTotal, static heuristic analysis, and NVIDIA SkillSpector. Rather than estimating malicious-skill prevalence, we study scanner disagreement. The three scanners rarely flag the same skills: any pair overlaps on at most 10.4% of their combined positives, only 0.69% of skills are flagged by all three, and 81.9% of flagged skills are identified by a single scanner. The disagreement is structured by attack surface. SkillSpector, which raises semantic agentic-risk advisories rather than malware-reputation signals, is positive for 19,209 of 25,504 suspicious rows (75.3%) but only 14 of 206 malicious rows (6.8%). The malicious-verdict region shows the inverse profile: 150 of 206 malicious rows (72.8%) are VirusTotal-positive, consistent with bundled-code malware evidence. These results show that agent-skill security requires layered governance, not single-scanner allow/block decisions. The corpus is released as a sanitized silver-standard dataset: labels are the registry's automated verdicts, not human-annotated ground truth, and the release represents an early, versioned snapshot intended to support the community while a human-annotated subset is developed. Further research is encouraged, including models tailored for skill-security triage.
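The disagreement figures quoted in the abstract (pairwise overlap on combined positives, the share of skills flagged by all three scanners, and the share of flagged skills identified by a single scanner) can be illustrated with a minimal sketch. It assumes that "overlap on combined positives" means intersection over union for a pair of scanners. The column names `virustotal`, `static`, and `skillspector`, the function `disagreement_stats`, and the toy rows are hypothetical and are not taken from the released dataset's schema.

```python
from itertools import combinations

# Hypothetical column names; the released dataset may use different ones.
SCANNERS = ("virustotal", "static", "skillspector")


def disagreement_stats(rows):
    """Summarize scanner disagreement.

    rows: iterable of dicts mapping each scanner name to a bool (flagged or not).
    """
    rows = list(rows)
    # Indices of the rows flagged by each scanner.
    flagged = {s: {i for i, r in enumerate(rows) if r[s]} for s in SCANNERS}

    # Assumed definition: pairwise overlap = |A & B| / |A | B|.
    pairwise = {}
    for a, b in combinations(SCANNERS, 2):
        union = flagged[a] | flagged[b]
        pairwise[(a, b)] = len(flagged[a] & flagged[b]) / len(union) if union else 0.0

    any_flag = set().union(*flagged.values())
    all_three = set.intersection(*flagged.values())
    single = {i for i in any_flag if sum(i in flagged[s] for s in SCANNERS) == 1}

    return {
        "pairwise_overlap": pairwise,
        # Denominator: all skills in the corpus.
        "all_three_share_of_skills": len(all_three) / len(rows) if rows else 0.0,
        # Denominator: skills flagged by at least one scanner.
        "single_scanner_share_of_flagged": len(single) / len(any_flag) if any_flag else 0.0,
    }


# Toy data for illustration only; these are not rows from the dataset.
toy = [
    {"virustotal": True,  "static": False, "skillspector": False},
    {"virustotal": False, "static": True,  "skillspector": False},
    {"virustotal": False, "static": False, "skillspector": True},
    {"virustotal": True,  "static": True,  "skillspector": True},
    {"virustotal": False, "static": False, "skillspector": False},
    {"virustotal": True,  "static": False, "skillspector": True},
]
stats = disagreement_stats(toy)
print(stats)
```

Note that the two shares use different denominators: the all-three figure is relative to every skill in the corpus, while the single-scanner figure is relative to skills flagged by at least one scanner. This matches how the abstract words the 0.69% and 81.9% figures.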
Context Matters: Repository-Aware Security Analysis of the Agent Skill Ecosystem
This paper conducts a large-scale, repository-aware security analysis of AI agen…
SkillSieve: A Hierarchical Triage Framework for Detecting Malicious AI Agent Skills
SkillSieve introduces a three-layer hierarchical framework to detect malicious A…
BadSkill: Backdoor Attacks on Agent Skills via Model-in-Skill Poisoning
The paper introduces BadSkill, a novel backdoor attack formulation that targets…
SkillTrojan: Backdoor Attacks on Skill-Based Agent Systems
SkillTrojan introduces a novel backdoor attack targeting the composition of reus…
SkillAttack: Automated Red Teaming of Agent Skills through Attack Path Refinement
SkillAttack is a red-teaming framework that dynamically tests the exploitability…
Towards Secure Agent Skills: Architecture, Threat Taxonomy, and Security Analysis
This paper provides the first comprehensive security analysis of the Agent Skill…
SkillProbe: Security Auditing for Emerging Agent Skill Marketplaces via Multi-Agent Collaboration
The paper proposes SkillProbe, a multi-agent security auditing framework, demons…
"Elementary, My Dear Watson." Detecting Malicious Skills via Neuro-Symbolic Reasoning across Heterog…
The paper introduces MalSkills, a neuro-symbolic framework that detects maliciou…