ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:

~ similar to 2605.01970v3· 20 results

cs.CRcs.AIcs.LGRecentMay 8, 2026

Defense effectiveness across architectural layers: a mechanistic evaluation of persistent memory attacks on stateful LLM agents

Jun Wen Leong

The paper systematically evaluates various defense mechanisms against persistent memory attacks on LLM agents, finding that only tool-gating at the memory layer (Memory Sandbox) effectively mitigates…

View →
cs.CRcs.AIRecentMay 28, 2026

Hijacking Agent Memory: Stealthy Trojan Attacks Through Conversational Interaction

Hongtao Wang, Se Yang, Yu Chen, Puzhuo Liu

The paper proposes MemPoison, a novel memory poisoning attack that injects triggerable backdoors into LLM agents' long-term memory through dialogue interactions, achieving high success rates by bypass…

View →
cs.CRcs.AIRecentMay 28, 2026

Hijacking Agent Memory: Stealthy Trojan Attacks Through Conversational Interaction

Hongtao Wang, Se Yang, Yu Chen, Puzhuo Liu

The paper introduces MemPoison, a novel memory poisoning attack that successfully injects triggerable backdoors into LLM agents' long-term memory through conversational interactions, achieving high at…

View →
cs.CRcs.AIRecentMay 24, 2026

MemMorph: Tool Hijacking in LLM Agents via Memory Poisoning

Xuanye Zhang, Yongsen Zheng, Zhuqin Xu, Kaiyu Zhou +4 more

MemMorph introduces a novel memory poisoning attack that biases LLM agent tool selection by injecting crafted records into the agent's long-term memory, achieving high success rates even against moder…

View →
cs.CRcs.AIRecentJun 3, 2026

From Untrusted Input to Trusted Memory: A Systematic Study of Memory Poisoning Attacks in LLM Agents

Pritam Dash, Tongyu Ge, Aditi Jain, Tanmay Shah +1 more

This paper systematically studies memory poisoning attacks in LLM agents, identifying multiple vulnerabilities and proposing a new benchmark to assess the risk.

View →
cs.CRcs.AIRecentApr 3, 2026

Poison Once, Exploit Forever: Environment-Injected Memory Poisoning Attacks on Web Agents

Wei Zou, Mingwen Dong, Miguel Romero Calvo, Shuaichen Chang +6 more

The paper introduces eTAMP, a novel attack that poisons LLM web agents' memory using only environmental observations, demonstrating cross-site and cross-session compromise without direct memory access…

View →
cs.CRcs.LGRecentApr 25, 2026

A Systematic Survey of Security Threats and Defenses in LLM-Based AI Agents: A Layered Attack Surface Framework

Kexin Chu

The paper proposes the Layered Attack Surface Model (LASM), a structural taxonomy that maps security threats and defenses across the complex, multi-layered architecture of AI agents, revealing signifi…

View →
cs.CRcs.AIcs.CLRecentMay 4, 2026

MAGE: Safeguarding LLM Agents against Long-Horizon Threats via Shadow Memory

Yuhui Wang, Tanqiu Jiang, Jiacheng Liang, Charles Fleming +1 more

The paper introduces MAGE, a novel defensive framework that uses a dedicated 'shadow memory' to proactively detect and mitigate long-horizon threats against LLM agents during complex, multi-step inter…

View →
cs.CRcs.AIRecentMay 14, 2026

MemLineage: Lineage-Guided Enforcement for LLM Agent Memory

Ciyan Ouyang, Rui Hou

MemLineage introduces a novel, cryptographically-backed defense mechanism that enforces a chain-of-custody for LLM agent memory, preventing untrusted or poisoned state from justifying sensitive action…

View →
cs.CRcs.AIcs.LGRecentMay 28, 2026

Honeyval: A Comprehensive Evaluation Framework for LLM-powered HTTP Honeypots

Mark Vero, Fabian Kaczmarczyck, Ivan Petrov, Ilia Shumailov +5 more

The paper introduces Honeyval, a comprehensive evaluation framework, to rigorously test LLM-powered HTTP honeypots, demonstrating that these honeypots provide substantially longer and harder-to-detect…

View →
cs.CRcs.AIcs.LGRecentMay 28, 2026

Honeyval: A Comprehensive Evaluation Framework for LLM-powered HTTP Honeypots

Mark Vero, Fabian Kaczmarczyck, Ivan Petrov, Ilia Shumailov +5 more

The paper introduces Honeyval, a comprehensive evaluation framework, to rigorously test LLM-powered HTTP honeypots, demonstrating that these systems provide substantially longer and harder-to-detect i…

View →
cs.CRcs.AIcs.CLRecentApr 17, 2026

A Survey on the Security of Long-Term Memory in LLM Agents: Toward Mnemonic Sovereignty

Zehao Lin, Chunyu Li, Kai Chen

This survey establishes persistent, writable memory as an independent security problem for LLM agents, proposing a comprehensive framework for 'mnemonic sovereignty' to govern the entire memory lifecy…

View →
cs.CRcs.AIRecentMar 20, 2026

Memory poisoning and secure multi-agent systems

Vicenç Torra, Maria Bras-Amorós

This paper analyzes memory poisoning attacks targeting multi-agent systems (MAS) powered by LLMs, proposing mitigation strategies across various memory types, especially focusing on secure design prin…

View →
cs.CRcs.AIcs.CLRecentMay 29, 2026

From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors

Jiejun Tan, Zhicheng Dou, Xinyu Yang, Yuyang Hu +3 more

This paper introduces ClawTrojan, a benchmark for multi-step trojan attacks against LLM agents, and proposes DASGuard, a dynamic defense mechanism that traces and sanitizes untrusted control content i…

View →
cs.CRcs.AIcs.CLRecentMay 29, 2026

From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors

Jiejun Tan, Zhicheng Dou, Xinyu Yang, Yuyang Hu +3 more

The paper introduces ClawTrojan, a benchmark for multi-step trojan attacks against LLM agents, and proposes DASGuard, a defense mechanism that detects and sanitizes backdoor content planted across mul…

View →
cs.CRcs.AIRecentMay 14, 2026

Hidden in Memory: Sleeper Memory Poisoning in LLM Agents

Sidharth Pulipaka, Stanislau Hlebik, Leonidas Raghav, Sahar Abdelnabi +3 more

The paper introduces and evaluates 'sleeper memory poisoning,' a delayed adversarial attack that corrupts an LLM agent's persistent memory by manipulating external context, demonstrating that these po…

View →
cs.CRRecentMar 28, 2026

Red-MIRROR: Agentic LLM-based Autonomous Penetration Testing with Reflective Verification and Knowledge-augmented Interaction

Tran Vy Khang, Nguyen Dang Nguyen Khang, Nghi Hoang Khoa, Do Thi Thu Hien +2 more

Red-MIRROR is a novel multi-agent LLM system that automates complex web penetration testing by integrating a memory-reflection backbone, achieving superior performance on industry benchmarks.

View →
cs.CRcs.AIcs.SERecentMay 21, 2026

Benchmarking Autonomous Agents against Temporal, Spatial, and Semantic Evasions

Jianan Ma, Xiaohu Du, Ruixiao Lin, Yaoxiang Bian +7 more

The paper introduces a multi-dimensional evasion framework and a new benchmark (A3S-Bench) to test autonomous agents, demonstrating that stateful, multi-turn attacks significantly increase system risk…

View →
cs.CRcs.AIRecentMay 4, 2026

When Agents Handle Secrets: A Survey of Confidential Computing for Agentic AI

Javad Forough, Marios Kogias, Hamed Haddadi

This survey analyzes the unique security threats posed by complex, multi-agent AI systems and proposes Confidential Computing (CC) using Trusted Execution Environments (TEEs) as a hardware-rooted defe…

View →
cs.CRcs.AIcs.DCRecentMay 31, 2026

AMP: A Vendor-Neutral Wire Format for Agent Memory Operations

Thamilvendhan Munirathinam

The paper introduces memorywire, a vendor-neutral JSON-Schema wire format and reference implementation designed to standardize and govern memory operations across disparate agent-memory frameworks.

View →