~ similar to 2605.06713v1· 20 results
This paper systematically maps the expanded attack surface of agentic AI systems, identifying new threat vectors like RAG poisoning and cross-agent manipulation, and proposes a comprehensive security…
AgentSOC introduces a multi-layered agentic AI framework designed to automate Security Operations Centers (SOCs) by integrating perception, anticipatory reasoning, and risk-based action planning to im…
AgenticVM is a multi-agent framework that uses LLMs and specialized tools to automate and drastically reduce the volume of software vulnerabilities into actionable, prioritized queues.
Zelin Zhang, Qi Li, Jie Cao, Lingshuang Liu +1 more
The paper analyzes the escalating security and safety threats posed by generative AI systems as they transition from merely generating content to executing real-world actions via tools and agents, fin…
The paper proposes CyberAId, a hybrid multi-agent system designed to enhance cybersecurity for financial institutions by integrating specialized LLM subagents with existing SIEM/XDR telemetry, address…
Ravish Gupta, Saket Kumar, Shreeya Sharma, Maulik Dang +1 more
The paper introduces a novel six-agent AI architecture for cybersecurity risk assessment, demonstrating high accuracy and speed compared to human experts, though its performance is ultimately limited…
Mihai Christodorescu, Earlence Fernandes, Ashish Hooda, Somesh Jha +10 more
The paper argues that agent security must be treated as a systems problem, requiring the enforcement of security invariants at the system level rather than solely relying on improving the underlying A…
Zhiyuan Li, Jingzheng Wu, Xiang Ling, Xing Cui +1 more
This paper provides the first comprehensive security analysis of the Agent Skills framework, identifying severe structural vulnerabilities that require fundamental architectural changes rather than si…
The paper introduces MATRA, a systematic threat modeling framework, to assess how known LLM threats translate into concrete, deployment-specific risks within autonomous agentic AI systems.
PocketAgents introduces a manifest-driven framework for autonomous defense agents, enabling measurable and attributable LLM-driven security responses by strictly controlling agent actions and telemetr…
Kevin Eykholt, Dhilung Kirat, Xiaokui Shu, Jiyong Jang +2 more
The paper reports on penetration tests conducted on proprietary, large-scale AI agent systems, finding that security vulnerabilities persist despite stricter development standards.
The paper proposes Dynamic Cyber Ranges, an advanced cyber range environment using LLM-driven Defender agents to counter the saturation of traditional security benchmarks, demonstrating that these dyn…
Automation-Exploit is a multi-agent LLM framework that enables adaptive offensive security by using a digital twin to safely test and execute high-risk memory-corruption exploits on live targets.
The paper proposes the Layered Attack Surface Model (LASM), a structural taxonomy that maps security threats and defenses across the complex, multi-layered architecture of AI agents, revealing signifi…
Muhammad Bilal, Jon Crowcroft, Ruizhi Wang, Xiaolong Xu +1 more
The paper surveys the use of LLMs for agentic NetOps and AIOps, arguing that operational reliability depends not on the model itself, but on robust surrounding machinery and workflow-centered evaluati…
This paper proposes the first web-focused threat model for agentic browsers, demonstrating that traditional web social engineering attacks can be amplified into dangerous, reproducible threats when ex…
The paper proposes an autonomous red teaming framework combining LLMs and RL to generate sophisticated, multi-stage cyber attack campaigns, demonstrating its necessity for evaluating robust AI-enabled…
The paper analyzes the failure modes of current AI containment methods when the agent itself is the adversary, deriving five necessary architectural requirements for durable safety.
Yuhang Wang, Haichang Gao, Zhenxing Niu, Zhaoxiang Liu +3 more
The paper systematically evaluates six OpenClaw-series AI agent frameworks, demonstrating that these agentized systems possess significant security vulnerabilities that are distinct from and more seve…
Red-MIRROR is a novel multi-agent LLM system that automates complex web penetration testing by integrating a memory-reflection backbone, achieving superior performance on industry benchmarks.