~ similar to 2605.20896v2· 20 results
The paper introduces AVDA, a framework that uses the Model Context Protocol (MCP) to automate cybersecurity detection authoring by integrating organizational context into AI code generation, achieving…
Chenning Li, Pan Hu, Justin Xu, Baris Ozbas +8 more
The paper introduces ADR, a novel, production-proven detection system that provides high-fidelity security monitoring for AI agents operating via the Model Context Protocol, significantly outperformin…
DeepXplain introduces an explainable deep reinforcement learning framework that enhances the trustworthiness and effectiveness of autonomous cyber defense against multi-stage APT campaigns by integrat…
The paper proposes Dynamic Cyber Ranges, an advanced cyber range environment using LLM-driven Defender agents to counter the saturation of traditional security benchmarks, demonstrating that these dyn…
ZERO-APT introduces a novel closed-loop adversarial framework for automated penetration testing that simulates attacks against an intelligent, real-time defending system, achieving a high attack succe…
Zhiyuan Li, Jingzheng Wu, Xiang Ling, Xing Cui +1 more
This paper provides the first comprehensive security analysis of the Agent Skills framework, identifying severe structural vulnerabilities that require fundamental architectural changes rather than si…
The paper introduces a challenging benchmark for LLM agents to perform unsupervised threat hunting on raw Windows event logs, finding that current frontier models perform poorly and are not ready for…
The paper introduces an AI red teaming agent that drastically reduces the time and effort required for security testing by allowing operators to define complex attack goals using natural language, com…
Yunhao Feng, Xiaohu Du, Xinhao Deng, Yifan Ding +12 more
BraveGuard is a self-evolving defense framework that significantly improves the safety monitoring of computer-use agents by generating guard model supervision from open-world threat discovery and real…
Yunhao Feng, Yifan Ding, Xiaohu Du, Ming Wen +12 more
BraveGuard is a self-evolving defense framework that improves the safety of computer-use agents by training guard models on open-world, multi-step threat trajectories rather than static benchmarks.
The paper introduces C-MADF, a causally constrained multi-agent framework that significantly reduces false positives in autonomous cyber defense by restricting response actions to structurally consist…
The paper introduces MonitoringBench, a semi-automated red-teaming methodology that generates diverse and stronger attacks, revealing that current coding-agent monitors often fail against sophisticate…
The paper evaluates Language Model Agents (LMAs) for red-teaming by benchmarking their ability to perform lateral movement, finding that expert-defined action plans are most effective, though all moda…
The paper proposes the Layered Attack Surface Model (LASM), a structural taxonomy that maps security threats and defenses across the complex, multi-layered architecture of AI agents, revealing signifi…
PocketAgents introduces a manifest-driven framework for autonomous defense agents, enabling measurable and attributable LLM-driven security responses by strictly controlling agent actions and telemetr…
Zonghao Ying, Haozheng Wang, Jiangfan Liu, Quanchen Zou +4 more
AgentVisor is a novel defense framework that uses semantic virtualization, inspired by OS principles, to significantly reduce LLM agent vulnerability to prompt injection while maintaining high utility…
Chong Xiang, Drew Zagieboylo, Shaona Ghosh, Sanjay Kariyappa +4 more
The paper proposes a vision for system-level defenses against indirect prompt injection attacks targeting AI agents, emphasizing structured control and human oversight.
The paper introduces Owner-Harm, a formal threat model addressing the critical blind spot of AI agents harming their own deployers, demonstrating that specialized defenses are needed beyond generic sa…
The paper proposes a novel semi-automated method to perform continuous threat modeling by inferring the actual system architecture from combined static configuration and dynamic network flow data, sig…
The paper empirically evaluates domain-adapted and general-purpose LLMs for structured threat modelling (STRIDE on 5G security), finding that domain adaptation and model size do not guarantee reliable…