~ similar to 2605.22321v1· 20 results
The paper proposes the Layered Attack Surface Model (LASM), a structural taxonomy that maps security threats and defenses across the complex, multi-layered architecture of AI agents, revealing signifi…
Yuhui Wang, Tanqiu Jiang, Jiacheng Liang, Charles Fleming +1 more
The paper introduces MAGE, a novel defensive framework that uses a dedicated 'shadow memory' to proactively detect and mitigate long-horizon threats against LLM agents during complex, multi-step inter…
Shi Liu, Xuehai Tang, Xikang Yang, Liang Lin +3 more
This paper introduces a new benchmark to test Tool Description Poisoning (TDP) attacks on LLM agents, demonstrating that even advanced models like GPT-4o are highly vulnerable and that current defense…
Yuhang Wang, Haichang Gao, Zhenxing Niu, Zhaoxiang Liu +3 more
The paper systematically evaluates six OpenClaw-series AI agent frameworks, demonstrating that these agentized systems possess significant security vulnerabilities that are distinct from and more seve…
Chiyu Zhang, Huiqin Yang, Bendong Jiang, Xiaolei Zhang +7 more
The paper introduces LITMUS, a novel benchmark that rigorously tests LLM agents for dangerous, physical-layer behavioral jailbreaks in real OS environments, revealing that current agents frequently ex…
This paper provides a systematic, layered review of security risks and defense strategies for autonomous agent frameworks, using OpenClaw as a case study to address the current lack of integrated rese…
The paper introduces a challenging benchmark for LLM agents to perform unsupervised threat hunting on raw Windows event logs, finding that current frontier models perform poorly and are not ready for…
This paper systematically maps the expanded attack surface of agentic AI systems, identifying new threat vectors like RAG poisoning and cross-agent manipulation, and proposes a comprehensive security…
ClawGuard is a novel runtime security framework that deterministically enforces user-confirmed rules at tool-call boundaries to protect LLM agents from indirect prompt injection.
Youness Bouchari, Matteo Boffa, Marco Mellia, Idilio Drago +2 more
The paper re-evaluates LLM agents on CTFs, finding that while general-purpose agents like claude-code are strong baselines, specialized, modular architectures significantly improve performance and con…
Huiyu Xu, Zhibo Wang, Wenhui Zhang, Ziqi Zhu +3 more
The paper introduces LoopTrap, an automated red-teaming framework that demonstrates how malicious prompts can poison the termination judgment of LLM agents, causing unbounded computation.
Hengyu An, Minxi Li, Jinghuai Zhang, Naen Xu +5 more
The paper introduces ACIArena, a unified and comprehensive evaluation framework designed to systematically test the robustness of Multi-Agent Systems against complex Agent Cascading Injection attacks.
Mark Vero, Fabian Kaczmarczyck, Ivan Petrov, Ilia Shumailov +5 more
The paper introduces Honeyval, a comprehensive evaluation framework, to rigorously test LLM-powered HTTP honeypots, demonstrating that these honeypots provide substantially longer and harder-to-detect…
Mark Vero, Fabian Kaczmarczyck, Ivan Petrov, Ilia Shumailov +5 more
The paper introduces Honeyval, a comprehensive evaluation framework, to rigorously test LLM-powered HTTP honeypots, demonstrating that these systems provide substantially longer and harder-to-detect i…
Kevin Eykholt, Dhilung Kirat, Xiaokui Shu, Jiyong Jang +2 more
The paper reports on penetration tests conducted on proprietary, large-scale AI agent systems, finding that security vulnerabilities persist despite stricter development standards.
PocketAgents introduces a manifest-driven framework for autonomous defense agents, enabling measurable and attributable LLM-driven security responses by strictly controlling agent actions and telemetr…
Xuanye Zhang, Yongsen Zheng, Zhuqin Xu, Kaiyu Zhou +4 more
MemMorph introduces a novel memory poisoning attack that biases LLM agent tool selection by injecting crafted records into the agent's long-term memory, achieving high success rates even against moder…
The paper introduces Session Risk Memory (SRM), a lightweight module that enhances per-action authorization gates with trajectory-level risk assessment, significantly improving detection of distribute…
This paper analyzes the security of LLM-based autonomous agents by drawing parallels to operating system security, finding that while some vulnerabilities are inherent, many can be mitigated using est…
This survey analyzes the unique security threats posed by complex, multi-agent AI systems and proposes Confidential Computing (CC) using Trusted Execution Environments (TEEs) as a hardware-rooted defe…