~ similar to 2603.20279v1· 20 results
The paper introduces C-MADF, a causally constrained multi-agent framework that significantly reduces false positives in autonomous cyber defense by restricting response actions to structurally consist…
Chris Hicks, Elizabeth Bates, Shae McFadden, Isaac Symes Thompson +11 more
This paper synthesizes expert knowledge from a workshop to provide a comprehensive framework and best-practice guidelines for developing high-quality reinforcement learning environments for autonomous…
Mihai Christodorescu, Earlence Fernandes, Ashish Hooda, Somesh Jha +10 more
The paper argues that agent security must be treated as a systems problem, requiring the enforcement of security invariants at the system level rather than solely relying on improving the underlying A…
The paper proposes an autonomous red teaming framework combining LLMs and RL to generate sophisticated, multi-stage cyber attack campaigns, demonstrating its necessity for evaluating robust AI-enabled…
Yaoyang Luo, Zhi Zheng, Ziwei Zhao, Tong Xu +4 more
This paper addresses the threat of coordinated misinformation in LLM-based Multi-Agent Systems by proposing a defense framework, STAR, that effectively identifies and rectifies misleading information…
This paper empirically demonstrates that the architectural design of multi-agent systems significantly impacts their security, finding that coordination mechanisms can introduce vulnerabilities greate…
The paper proposes Dynamic Cyber Ranges, an advanced cyber range environment using LLM-driven Defender agents to counter the saturation of traditional security benchmarks, demonstrating that these dyn…
Zhen Huang, Zhihuang Liu, Mengxuan Luo, Weishang Wu +1 more
The paper proposes a novel attack paradigm demonstrating how compromising a single robot in an LLM-controlled multi-robot system can rapidly propagate malicious intent to cause coordinated unsafe acti…
Taein Lim, Seongyong Ju, Munhyeok Kim, Hyunjun Kim +1 more
The paper introduces CyBiasBench, a comprehensive benchmark that quantifies the inherent, agent-specific bias in LLM agents' attack selection patterns in cybersecurity scenarios.
This paper proposes a set of design principles and a conceptual benchmark (SOC-bench) to systematically evaluate the blue team operational capabilities of multi-agent AI systems in autonomous Security…
The paper evaluates Language Model Agents (LMAs) for red-teaming by benchmarking their ability to perform lateral movement, finding that expert-defined action plans are most effective, though all moda…
Yihe Fan, Changyi Li, Lichen Xu, Xudong Pan +3 more
The paper introduces CyberEvolver, a self-evolving agent framework that iteratively revises its own operational scaffold based on failed execution attempts, significantly improving cybersecurity agent…
Kerri Prinos, Lilianne Brush, Cameron Denton, Zhanqi Wang +4 more
The paper proposes a tool-mediated LLM architecture for autonomous cyber defense, formally proving its stability and demonstrating that it significantly reduces an attacker's expected payoff in real-w…
Hengyu An, Minxi Li, Jinghuai Zhang, Naen Xu +5 more
The paper introduces ACIArena, a unified and comprehensive evaluation framework designed to systematically test the robustness of Multi-Agent Systems against complex Agent Cascading Injection attacks.
The paper forecasts that agentic AI will compress the cyber attack lifecycle by lowering the cost of multiple attack stages, necessitating immediate operational security upgrades for enterprises and t…
The paper proposes the Layered Attack Surface Model (LASM), a structural taxonomy that maps security threats and defenses across the complex, multi-layered architecture of AI agents, revealing signifi…
Zelin Wan, Jin-Hee Cho, Mu Zhu, Ahmed H. Anwar +2 more
This paper proposes using cyber deception with honey drones (HDs) to defend UAV mission systems against Denial-of-Service (DoS) attacks, achieving superior performance using a novel Hypergame-Theoreti…
The paper introduces an AI red teaming agent that drastically reduces the time and effort required for security testing by allowing operators to define complex attack goals using natural language, com…
This paper provides a systematic, layered review of security risks and defense strategies for autonomous agent frameworks, using OpenClaw as a case study to address the current lack of integrated rese…
Ravish Gupta, Saket Kumar, Shreeya Sharma, Maulik Dang +1 more
The paper introduces a novel six-agent AI architecture for cybersecurity risk assessment, demonstrating high accuracy and speed compared to human experts, though its performance is ultimately limited…