The paper analyzes a sanitized dataset of 67,453 public OpenClaw agent skills and shows that three scanner families (VirusTotal, static heuristic analysis, and NVIDIA SkillSpector) rarely flag the same skills, which argues for layered security governance rather than single-scanner allow/block decisions.
Agent skills extend AI agents with reusable instructions, tools, scripts, references, and workflows, establishing a security boundary distinct from both model safety and traditional package-malware detection. ClawHub Security Signals is a sanitized dataset of 67,453 latest public OpenClaw skill versions. Each row pairs redacted SKILL.md content and sanitized bundled files where present with a final ClawScan registry verdict and evidence from three scanner families: VirusTotal, static heuristic analysis, and NVIDIA SkillSpector. Rather than estimating malicious-skill prevalence, we study scanner disagreement. The three scanners rarely flag the same skills: any pair overlaps on at most 10.4% of their combined positives, only 0.69% of skills are flagged by all three, and 81.9% of flagged skills are identified by a single scanner. The disagreement is structured by attack surface. SkillSpector, which raises semantic agentic-risk advisories rather than malware-reputation signals, is positive for 19,209 of 25,504 suspicious rows (75.3%) but only 14 of 206 malicious rows (6.8%). The malicious-verdict region shows the inverse profile: 150 of 206 malicious rows (72.8%) are VirusTotal-positive, consistent with bundled-code malware evidence. These results show that agent-skill security requires layered governance, not single-scanner allow/block decisions. The corpus is released as a sanitized silver-standard dataset: labels are the registry's automated verdicts, not human-annotated ground truth, and the release represents an early, versioned snapshot intended to support the community while a human-annotated subset is developed. Further research is encouraged, including models tailored for skill-security triage.
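The agreement statistics quoted above (pairwise overlap on combined positives, the share of skills flagged by all three scanners, and the share of flagged skills caught by only one scanner) can be illustrated with a minimal Python sketch. It assumes that "overlap on combined positives" means intersection over union (a Jaccard index), which is one plausible reading and is not confirmed by the abstract. The function name `agreement_stats`, the scanner keys, and the toy skill IDs are all hypothetical and do not reflect the dataset's actual schema or values.

```python
from itertools import combinations


def agreement_stats(flags, total_skills):
    """Summarize scanner agreement.

    flags: dict mapping a scanner name to the set of skill IDs it flagged
           (hypothetical structure, not the released dataset's schema).
    total_skills: number of skills in the corpus, flagged or not.
    """
    names = sorted(flags)

    # Pairwise overlap as |A & B| / |A | B| (assumed Jaccard reading).
    pairwise = {}
    for a, b in combinations(names, 2):
        union = flags[a] | flags[b]
        pairwise[(a, b)] = len(flags[a] & flags[b]) / len(union) if union else 0.0

    flagged_any = set().union(*flags.values())
    flagged_all = set.intersection(*flags.values())

    # Skills flagged by exactly one scanner.
    single = {s for s in flagged_any
              if sum(s in hits for hits in flags.values()) == 1}

    return {
        "pairwise": pairwise,
        "all_three_share": len(flagged_all) / total_skills,
        "single_share": len(single) / len(flagged_any) if flagged_any else 0.0,
    }


# Toy data only: ten skills, three scanners, invented IDs.
toy_flags = {
    "virustotal": {1, 2, 3},
    "static": {3, 4},
    "skillspector": {3, 5, 6, 7},
}
stats = agreement_stats(toy_flags, total_skills=10)
print(stats)
```

Note that the two shares use different denominators, as in the abstract: the all-three figure is taken over all skills, while the single-scanner figure is taken over flagged skills only.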
When Safe Skills Collide: Measuring Compositional Risk in Agent Skill Ecosystems
The paper introduces SkillReact, a framework that measures compositional risk in…
Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems
The paper introduces SkillVetBench, a novel two-stage benchmark that effectively…
SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents
SeClaw is a new framework that synthesizes security tasks from structured risk s…
SkillsInjector: Dynamic Skill Context Construction for LLM Agents
SkillsInjector proposes a two-stage adaptive method to dynamically optimize skil…
Technical Report: Exploring the Emerging Threats of the Agent Skill Ecosystem
The paper analyzes a large sample of AI agent skills, revealing that a significa…
SIRI: Self-Internalizing Reinforcement Learning with Intrinsic Skills for LLM Agent Training
SIRI introduces a self-internalizing reinforcement learning framework that allow…
SkillRevise: Improving LLM-Authored Agent Skills via Trace-Conditioned Skill Revision
SkillRevise is an execution-grounded framework that iteratively refines initial,…
Skill is Not One-Size-Fits-All: Model-Aware Skill Alignment for LLM Agents
The paper introduces MASA, a model-aware skill alignment framework that adaptive…