~ similar to 2605.26154v1· 20 results
The paper proposes MemPoison, a novel memory poisoning attack that injects triggerable backdoors into LLM agents' long-term memory through dialogue interactions, achieving high success rates by bypass…
The paper introduces MemPoison, a novel memory poisoning attack that successfully injects triggerable backdoors into LLM agents' long-term memory through conversational interactions, achieving high at…
Pritam Dash, Tongyu Ge, Aditi Jain, Tanmay Shah +1 more
This paper systematically studies memory poisoning attacks in LLM agents, identifying multiple vulnerabilities and proposing a new benchmark to assess the risk.
Debeshee Das, Julien Piet, Darya Kaviani, Luca Beurer-Kellner +2 more
The paper introduces Trojan Hippo, a persistent memory attack that exfiltrates sensitive data from LLM agents by planting dormant payloads into long-term memory, and develops a comprehensive framework…
Bingyu Yan, Xiaoming Zhang, Jinyu Hou, Chaozhuo Li +3 more
Evo-Attacker introduces a memory-augmented reinforcement learning framework to perform generalized, long-horizon tool attacks on LLM-MAS, significantly outperforming existing methods.
MemLineage introduces a novel, cryptographically-backed defense mechanism that enforces a chain-of-custody for LLM agent memory, preventing untrusted or poisoned state from justifying sensitive action…
The paper identifies 'memory-induced tool-drift,' a systematic vulnerability where personality biases stored in an LLM agent's memory silently corrupt tool-calling decisions, even when those biases ar…
Yang Luo, Zifeng Kang, Tiantian Ji, Xinran Liu +3 more
The paper introduces SHADOWMERGE, a novel poisoning attack that successfully compromises graph-based agent memory by exploiting relation-channel conflicts, achieving a high attack success rate across…
Wei Zou, Mingwen Dong, Miguel Romero Calvo, Shuaichen Chang +6 more
The paper introduces eTAMP, a novel attack that poisons LLM web agents' memory using only environmental observations, demonstrating cross-site and cross-session compromise without direct memory access…
Yuhui Wang, Tanqiu Jiang, Jiacheng Liang, Charles Fleming +1 more
The paper introduces MAGE, a novel defensive framework that uses a dedicated 'shadow memory' to proactively detect and mitigate long-horizon threats against LLM agents during complex, multi-step inter…
The paper systematically evaluates various defense mechanisms against persistent memory attacks on LLM agents, finding that only tool-gating at the memory layer (Memory Sandbox) effectively mitigates…
The paper introduces Obsessive Experience Poisoning (OEP), a low-privilege black-box attack that poisons self-evolving LLM agents by generating locally correct but harmful experiences, causing dangero…
Shi Liu, Xuehai Tang, Xikang Yang, Liang Lin +3 more
This paper introduces a new benchmark to test Tool Description Poisoning (TDP) attacks on LLM agents, demonstrating that even advanced models like GPT-4o are highly vulnerable and that current defense…
The paper identifies Mid-Session Tool Injection (MSTI) as a novel threat in the WebMCP protocol, demonstrating that attackers can manipulate the visible or perceived set of tools available to AI agent…
This paper analyzes memory poisoning attacks targeting multi-agent systems (MAS) powered by LLMs, proposing mitigation strategies across various memory types, especially focusing on secure design prin…
The paper introduces MEMSAD, a gradient-coupled defense mechanism that provides certified anomaly detection against memory poisoning attacks in retrieval-augmented LLM agents, achieving perfect detect…
This paper systematically maps the expanded attack surface of agentic AI systems, identifying new threat vectors like RAG poisoning and cross-agent manipulation, and proposes a comprehensive security…
The paper proposes the Layered Attack Surface Model (LASM), a structural taxonomy that maps security threats and defenses across the complex, multi-layered architecture of AI agents, revealing signifi…
The paper introduces and evaluates 'sleeper memory poisoning,' a delayed adversarial attack that corrupts an LLM agent's persistent memory by manipulating external context, demonstrating that these po…
The paper introduces TRUSTDESC, a novel framework that prevents tool poisoning attacks in LLM applications by automatically generating highly accurate and trusted tool descriptions directly from the t…