Similar to 2604.04442v1 · 20 results
The paper proposes an autonomous red teaming framework combining LLMs and RL to generate sophisticated, multi-stage cyber attack campaigns, demonstrating its necessity for evaluating robust AI-enabled…
The paper evaluates Language Model Agents (LMAs) for red-teaming by benchmarking their ability to perform lateral movement, finding that expert-defined action plans are most effective, though all moda…
Kerri Prinos, Lilianne Brush, Cameron Denton, Zhanqi Wang +4 more
The paper proposes a tool-mediated LLM architecture for autonomous cyber defense, formally proving its stability and demonstrating that it significantly reduces an attacker's expected payoff in real-w…
Philip Huff, Dakota Dale, Harshith Guduru, Rohan Singh +1 more
The paper proposes a system that operationalizes cybersecurity governance frameworks by integrating them with attack-path modeling and Deep Reinforcement Learning to generate practical, resource-const…
DeepStage is a deep reinforcement learning framework that achieves autonomous, stage-aware defense against multi-stage APT campaigns by fusing graph-based telemetry and predicting attacker stages.
DeepXplain introduces an explainable deep reinforcement learning framework that enhances the trustworthiness and effectiveness of autonomous cyber defense against multi-stage APT campaigns by integrat…
Chris Hicks, Elizabeth Bates, Shae McFadden, Isaac Symes Thompson +11 more
This paper synthesizes expert knowledge from a workshop to provide a comprehensive framework and best-practice guidelines for developing high-quality reinforcement learning environments for autonomous…
ZERO-APT introduces a novel closed-loop adversarial framework for automated penetration testing that simulates attacks against an intelligent, real-time defending system, achieving a high attack succe…
The paper proposes Dynamic Cyber Ranges, an advanced cyber range environment using LLM-driven Defender agents to counter the saturation of traditional security benchmarks, demonstrating that these dyn…
The paper introduces Agent-ToM, a Theory-of-Mind (ToM) based framework that learns to monitor autonomous LLM agents by explicitly reasoning about their hidden beliefs and intentions to detect covert m…
The paper proposes the Policy-Execution-Authorization (PEA) architecture, a separation-of-powers system designed to structurally enforce goal integrity in AI agents, moving safety from a probabilistic…
This paper empirically demonstrates that the architectural design of multi-agent systems significantly impacts their security, finding that coordination mechanisms can introduce vulnerabilities greate…
The paper introduces a challenging benchmark for LLM agents to perform unsupervised threat hunting on raw Windows event logs, finding that current frontier models perform poorly and are not ready for…
The paper introduces STRIATUM-CTF, a modular agentic framework that uses a standardized context protocol to enable LLMs to perform multi-step, stateful reasoning for general-purpose CTF solving, achie…
This paper demonstrates that using a communication algorithm (CommFormer) with heterogeneous agents significantly improves the speed and performance of multi-agent reinforcement learning for autonomou…
This paper introduces a foundational framework and taxonomy for managing catastrophic AI loss of control (LOC) incidents, providing a proportional guide for response based on the severity and recovera…
Benlong Wu, Weiming Zhang, Kejiang Chen, Han Fang +1 more
The paper introduces an executable Proof-Constrained Action (ePCA) framework that secures AI agents by forcing them to formalize their intentions into first-order logical constraints, achieving provab…
Benlong Wu, Weiming Zhang, Kejiang Chen, Han Fang +1 more
The paper introduces a formal, logically constrained framework, ePCA, to secure advanced AI agents by forcing them to translate natural language intentions into first-order logical constraints before…
The paper proposes the Layered Attack Surface Model (LASM), a structural taxonomy that maps security threats and defenses across the complex, multi-layered architecture of AI agents, revealing signifi…
Zhen Huang, Zhihuang Liu, Mengxuan Luo, Weishang Wu +1 more
The paper proposes a novel attack paradigm demonstrating how compromising a single robot in an LLM-controlled multi-robot system can rapidly propagate malicious intent to cause coordinated unsafe acti…