~ similar to 2605.28104· 20 results
Siyuan Li, Zehao Liu, Xi Lin, Qinghua Mao +5 more
CoopGuard is a novel stateful, multi-round defense framework using cooperative agents to significantly reduce the success rate of evolving adversarial attacks against Large Language Models.
Hengyu An, Minxi Li, Jinghuai Zhang, Naen Xu +5 more
The paper introduces ACIArena, a unified and comprehensive evaluation framework designed to systematically test the robustness of Multi-Agent Systems against complex Agent Cascading Injection attacks.
Zhen Huang, Zhihuang Liu, Mengxuan Luo, Weishang Wu +1 more
The paper proposes a novel attack paradigm demonstrating how compromising a single robot in an LLM-controlled multi-robot system can rapidly propagate malicious intent to cause coordinated unsafe acti…
The paper proposes the Layered Attack Surface Model (LASM), a structural taxonomy that maps security threats and defenses across the complex, multi-layered architecture of AI agents, revealing signifi…
The paper evaluates Language Model Agents (LMAs) for red-teaming by benchmarking their ability to perform lateral movement, finding that expert-defined action plans are most effective, though all moda…
This paper analyzes the failure of current embedding-based defenses in multi-agent LLM systems and proposes using token-level confidence scores (logits) for improved robustness.
Youness Bouchari, Matteo Boffa, Marco Mellia, Idilio Drago +2 more
The paper re-evaluates LLM agents on CTFs, finding that while general-purpose agents like claude-code are strong baselines, specialized, modular architectures significantly improve performance and con…
This paper analyzes memory poisoning attacks targeting multi-agent systems (MAS) powered by LLMs, proposing mitigation strategies across various memory types, especially focusing on secure design prin…
The paper proposes a novel '3+1' heterogeneous multi-agent architecture using cloud LLMs and a local verifier to achieve high-accuracy, cost-effective code vulnerability detection, significantly outpe…
This paper systematically maps the expanded attack surface of agentic AI systems, identifying new threat vectors like RAG poisoning and cross-agent manipulation, and proposes a comprehensive security…
Jianan Ma, Xiaohu Du, Ruixiao Lin, Yaoxiang Bian +7 more
The paper introduces a multi-dimensional evasion framework and a new benchmark (A3S-Bench) to test autonomous agents, demonstrating that stateful, multi-turn attacks significantly increase system risk…
This paper provides a systematic, layered review of security risks and defense strategies for autonomous agent frameworks, using OpenClaw as a case study to address the current lack of integrated rese…
The paper proposes an autonomous red teaming framework combining LLMs and RL to generate sophisticated, multi-stage cyber attack campaigns, demonstrating its necessity for evaluating robust AI-enabled…
Jinhu Qi, Muzhi Li, Jiahong Liu, Yuqin Shu +8 more
This survey provides a comprehensive, practical guide to ensuring the trustworthiness of complex, autonomous agentic AI systems by focusing on safety, robustness, privacy, and system security.
The paper proposes MemPoison, a novel memory poisoning attack that injects triggerable backdoors into LLM agents' long-term memory through dialogue interactions, achieving high success rates by bypass…
The paper introduces MemPoison, a novel memory poisoning attack that successfully injects triggerable backdoors into LLM agents' long-term memory through conversational interactions, achieving high at…
Zi Liang, Ronghua Li, Yanyun Wang, Qingqing Ye +1 more
This paper introduces Mobius Injection, a novel, lightweight attack that weaponizes autonomous LLM agents into zombie nodes to launch highly scalable AbO-DDoS attacks by exploiting a vulnerability cal…
This paper empirically demonstrates that the architectural design of multi-agent systems significantly impacts their security, finding that coordination mechanisms can introduce vulnerabilities greate…
This paper models the security risks of subagent spawning in multi-agent networks, demonstrating that insecure memory inheritance from parent agents allows local compromises to spread across system bo…
This paper introduces and evaluates guardian-based defenses, showing that an intermediary LLM agent can significantly reduce the success rate of skill injection attacks on terminal-based agents, even…