~ similar to 2605.08442v1· 20 results
Debeshee Das, Julien Piet, Darya Kaviani, Luca Beurer-Kellner +2 more
The paper introduces Trojan Hippo, a persistent memory attack that exfiltrates sensitive data from LLM agents by planting dormant payloads into long-term memory, and develops a comprehensive framework…
Pritam Dash, Tongyu Ge, Aditi Jain, Tanmay Shah +1 more
This paper systematically studies memory poisoning attacks in LLM agents, identifying multiple vulnerabilities and proposing a new benchmark to assess the risk.
MemLineage introduces a novel, cryptographically-backed defense mechanism that enforces a chain-of-custody for LLM agent memory, preventing untrusted or poisoned state from justifying sensitive action…
Xuanye Zhang, Yongsen Zheng, Zhuqin Xu, Kaiyu Zhou +4 more
MemMorph introduces a novel memory poisoning attack that biases LLM agent tool selection by injecting crafted records into the agent's long-term memory, achieving high success rates even against moder…
The paper proposes MemPoison, a novel memory poisoning attack that injects triggerable backdoors into LLM agents' long-term memory through dialogue interactions, achieving high success rates by bypass…
The paper introduces MemPoison, a novel memory poisoning attack that successfully injects triggerable backdoors into LLM agents' long-term memory through conversational interactions, achieving high at…
The paper introduces uGen, the first LLM-driven framework that uses a retrieval-augmented, multi-agent design to automatically generate functionally correct microarchitectural attack Proof-of-Concepts…
The paper introduces an automated framework demonstrating that LLM system instructions are vulnerable to encoding attacks, where structured output requests can bypass safety refusals and leak sensitiv…
The paper proposes the Layered Attack Surface Model (LASM), a structural taxonomy that maps security threats and defenses across the complex, multi-layered architecture of AI agents, revealing signifi…
Yuhui Wang, Tanqiu Jiang, Jiacheng Liang, Charles Fleming +1 more
The paper introduces MAGE, a novel defensive framework that uses a dedicated 'shadow memory' to proactively detect and mitigate long-horizon threats against LLM agents during complex, multi-step inter…
This survey establishes persistent, writable memory as an independent security problem for LLM agents, proposing a comprehensive framework for 'mnemonic sovereignty' to govern the entire memory lifecy…
The paper introduces and evaluates 'sleeper memory poisoning,' a delayed adversarial attack that corrupts an LLM agent's persistent memory by manipulating external context, demonstrating that these po…
Jianan Ma, Xiaohu Du, Ruixiao Lin, Yaoxiang Bian +7 more
The paper introduces a multi-dimensional evasion framework and a new benchmark (A3S-Bench) to test autonomous agents, demonstrating that stateful, multi-turn attacks significantly increase system risk…
Hammad Atta, Ken Huang, Kyriakos Rock Lambros, Yasir Mehmood +10 more
The paper introduces LAAF, a novel automated red-teaming framework, to systematically test and exploit Logic-layer Prompt Control Injection (LPCI) vulnerabilities in complex agentic LLM systems.
Jiejun Tan, Zhicheng Dou, Xinyu Yang, Yuyang Hu +3 more
This paper introduces ClawTrojan, a benchmark for multi-step trojan attacks against LLM agents, and proposes DASGuard, a dynamic defense mechanism that traces and sanitizes untrusted control content i…
Jiejun Tan, Zhicheng Dou, Xinyu Yang, Yuyang Hu +3 more
The paper introduces ClawTrojan, a benchmark for multi-step trojan attacks against LLM agents, and proposes DASGuard, a defense mechanism that detects and sanitizes backdoor content planted across mul…
Yanming Mu, Hao Hu, Feiyang Li, Qiao Yuan +6 more
This paper provides the first comprehensive, end-to-end survey dedicated to the security of Retrieval-Augmented Generation (RAG) systems, systematically mapping threats, defenses, and benchmarks acros…
The paper introduces Session Risk Memory (SRM), a lightweight module that enhances per-action authorization gates with trajectory-level risk assessment, significantly improving detection of distribute…
Yuanbo Xie, Yingjie Zhang, Yulin Li, Shouyou Song +4 more
The paper introduces CanaryRAG, a novel dual-path runtime defense mechanism that detects RAG Knowledge Base Leakage attacks by embedding canary tokens into retrieved knowledge chunks.
This paper provides a systematic, lifecycle-based framework for analyzing security threats and defenses across the entire fine-tuning process of LLMs, revealing that attack effectiveness is highly mod…