ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:

~ similar to 2605.03482v2· 20 results

cs.CRcs.AIRecentMay 24, 2026

MemMorph: Tool Hijacking in LLM Agents via Memory Poisoning

Xuanye Zhang, Yongsen Zheng, Zhuqin Xu, Kaiyu Zhou +4 more

MemMorph introduces a novel memory poisoning attack that biases LLM agent tool selection by injecting crafted records into the agent's long-term memory, achieving high success rates even against moder…

View →
cs.CRcs.AIRecentJun 3, 2026

From Untrusted Input to Trusted Memory: A Systematic Study of Memory Poisoning Attacks in LLM Agents

Pritam Dash, Tongyu Ge, Aditi Jain, Tanmay Shah +1 more

This paper systematically studies memory poisoning attacks in LLM agents, identifying multiple vulnerabilities and proposing a new benchmark to assess the risk.

View →
cs.CRcs.AIRecentMay 28, 2026

Hijacking Agent Memory: Stealthy Trojan Attacks Through Conversational Interaction

Hongtao Wang, Se Yang, Yu Chen, Puzhuo Liu

The paper proposes MemPoison, a novel memory poisoning attack that injects triggerable backdoors into LLM agents' long-term memory through dialogue interactions, achieving high success rates by bypass…

View →
cs.CRcs.AIRecentMay 28, 2026

Hijacking Agent Memory: Stealthy Trojan Attacks Through Conversational Interaction

Hongtao Wang, Se Yang, Yu Chen, Puzhuo Liu

The paper introduces MemPoison, a novel memory poisoning attack that successfully injects triggerable backdoors into LLM agents' long-term memory through conversational interactions, achieving high at…

View →
cs.CRcs.AIRecentMar 20, 2026

Memory poisoning and secure multi-agent systems

Vicenç Torra, Maria Bras-Amorós

This paper analyzes memory poisoning attacks targeting multi-agent systems (MAS) powered by LLMs, proposing mitigation strategies across various memory types, especially focusing on secure design prin…

View →
cs.CRcs.AIcs.LGRecentMay 8, 2026

Defense effectiveness across architectural layers: a mechanistic evaluation of persistent memory attacks on stateful LLM agents

Jun Wen Leong

The paper systematically evaluates various defense mechanisms against persistent memory attacks on LLM agents, finding that only tool-gating at the memory layer (Memory Sandbox) effectively mitigates…

View →
cs.CRcs.AIRecentApr 3, 2026

Poison Once, Exploit Forever: Environment-Injected Memory Poisoning Attacks on Web Agents

Wei Zou, Mingwen Dong, Miguel Romero Calvo, Shuaichen Chang +6 more

The paper introduces eTAMP, a novel attack that poisons LLM web agents' memory using only environmental observations, demonstrating cross-site and cross-session compromise without direct memory access…

View →
cs.CRcs.AIRecentMay 14, 2026

MemLineage: Lineage-Guided Enforcement for LLM Agent Memory

Ciyan Ouyang, Rui Hou

MemLineage introduces a novel, cryptographically-backed defense mechanism that enforces a chain-of-custody for LLM agent memory, preventing untrusted or poisoned state from justifying sensitive action…

View →
cs.CRcs.AIRecentMar 26, 2026

PIDP-Attack: Combining Prompt Injection with Database Poisoning Attacks on Retrieval-Augmented Generation Systems

Haozhen Wang, Haoyue Liu, Jionghao Zhu, Zhichao Wang +2 more

The paper introduces PIDP-Attack, a novel compound adversarial attack that combines prompt injection with database poisoning to manipulate Retrieval-Augmented Generation (RAG) systems against arbitrar…

View →
cs.CRcs.AIcs.LGRecentMay 12, 2026

The Misattribution Gap: When Memory Poisoning Looks Like Model Failure in Agentic AI Systems

Tanzim Ahad, Ismail Hossain, Md Jahangir Alam, Sai Puppala +2 more

The paper identifies the Misattribution Gap, showing that memory-layer attacks (Semantic Norm Drift) can mimic model failure in multi-agent AI systems, and proposes novel detection and mitigation tech…

View →
cs.CRcs.AIRecentMay 14, 2026

Hidden in Memory: Sleeper Memory Poisoning in LLM Agents

Sidharth Pulipaka, Stanislau Hlebik, Leonidas Raghav, Sahar Abdelnabi +3 more

The paper introduces and evaluates 'sleeper memory poisoning,' a delayed adversarial attack that corrupts an LLM agent's persistent memory by manipulating external context, demonstrating that these po…

View →
cs.CRcs.LGRecentMay 27, 2026

MRMMIA: Membership Inference Attacks on Memory in Chat Agents

Kai Chen, Yan Pang, Tianhao Wang

The paper proposes Multi-Recall Memory MIA (MRMMIA), a unified attack framework to test for privacy leakage by determining if a candidate memory unit belongs to a chat agent's private memory store.

View →
cs.CRcs.AIRecentMay 26, 2026

Cordon-MAS: Defending RAG against Knowledge Poisoning via Information-Flow Control

Zhe Yu, Wenpeng Xing, Gaolei Li, Shuguang Xiong +3 more

The paper introduces CORDON-MAS, a compartmentalized framework that defends Retrieval-Augmented Generation (RAG) against knowledge poisoning by enforcing strict information-flow control, significantly…

View →
cs.CRcs.AIcs.LGRecentMay 18, 2026

OEP: Poisoning Self-Evolving LLM Agents via Locally Correct but Non-Transferable Experiences

Kaixiang Wang, Jiong Lou, Zhaojiacheng Zhou, Jie Li

The paper introduces Obsessive Experience Poisoning (OEP), a low-privilege black-box attack that poisons self-evolving LLM agents by generating locally correct but harmful experiences, causing dangero…

View →
cs.CRcs.AIRecentMay 3, 2026

Trojan Hippo: Weaponizing Agent Memory for Data Exfiltration

Debeshee Das, Julien Piet, Darya Kaviani, Luca Beurer-Kellner +2 more

The paper introduces Trojan Hippo, a persistent memory attack that exfiltrates sensitive data from LLM agents by planting dormant payloads into long-term memory, and develops a comprehensive framework…

View →
cs.CRcs.AIRecentMay 9, 2026

ShadowMerge: A Novel Poisoning Attack on Graph-Based Agent Memory via Relation-Channel Conflicts

Yang Luo, Zifeng Kang, Tiantian Ji, Xinran Liu +3 more

The paper introduces SHADOWMERGE, a novel poisoning attack that successfully compromises graph-based agent memory by exploiting relation-channel conflicts, achieving a high attack success rate across…

View →
cs.CRcs.AIRecentApr 30, 2026

Latent Adversarial Detection: Adaptive Probing of LLM Activations for Multi-Turn Attack Detection

Prashant Kulkarni

The paper introduces 'adversarial restlessness,' an activation-level signature in LLM residual streams, to detect multi-turn prompt injection attacks with high accuracy.

View →
cs.CRcs.AIRecentApr 24, 2026

RouteGuard: Internal-Signal Detection of Skill Poisoning in LLM Agents

Wenjie Xiao, Xuehai Tang, Biyu Zhou, Songlin Hu +1 more

RouteGuard is a novel detector that identifies skill poisoning in LLM agents by monitoring structured internal attention shifts, achieving high detection rates on critical skill-injection attacks.

View →
cs.CRcs.CLcs.IRRecentMay 27, 2026

SilentRetrieval: Hijacking Retrieval-Augmented Generation via Semantically-Preserving Adversarial Data Poisoning

Jiachen Qian

SilentRetrieval introduces a sophisticated, two-stage data poisoning attack that successfully hijacks Retrieval-Augmented Generation (RAG) systems by injecting adversarially crafted, yet highly fluent…

View →
cs.CRcs.AIcs.CLRecentMay 4, 2026

MAGE: Safeguarding LLM Agents against Long-Horizon Threats via Shadow Memory

Yuhui Wang, Tanqiu Jiang, Jiacheng Liang, Charles Fleming +1 more

The paper introduces MAGE, a novel defensive framework that uses a dedicated 'shadow memory' to proactively detect and mitigate long-horizon threats against LLM agents during complex, multi-step inter…

View →