Papers similar to 2605.30686

~ similar to 2605.30686· 20 results

cs.CRcs.AIcs.LGRecentMay 29, 2026

Depth-Dependent Indirect Prompt Injection in Tool-Calling ReAct Agents: Injection Depth, Payload Framing, and Turn-Budget Sensitivity

This paper investigates indirect prompt injection vulnerabilities in ReAct agents by systematically analyzing how the injection depth and payload framing affect attack success rates, finding that inje…

View →

cs.CRcs.AIRecentMay 18, 2026

LivePI: More Realistic Benchmarking of Agents Against Indirect Prompt Injection

Lei Zhao, Abhay Bhaskar, Edgar Dobriban

The paper introduces LivePI, a structured, production-like benchmark that rigorously tests the vulnerability of AI agents to indirect prompt injection across multiple real-world input surfaces, reveal…

View →

cs.CRcs.AIRecentMay 8, 2026

WebTrap: Stealthy Mid-Task Hijacking of Browser Agents During Navigation

Zhichao Liu, Wenbo Pan, Haining Yu, Ge Gao +2 more

WebTrap introduces a stealthy, mid-task hijacking attack that successfully compromises browser agents during long-horizon tasks by seamlessly fusing malicious instructions with the original user goal.

View →

cs.CRcs.AIRecentMar 25, 2026

Invisible Threats from Model Context Protocol: Generating Stealthy Injection Payload via Tree-based Adaptive Search

Yulin Shen, Xudong Pan, Geng Hong, Min Yang

The paper introduces Tree structured Injection for Payloads (TIP), a novel black-box attack framework that reliably generates stealthy injection payloads to seize control of LLM agents utilizing the M…

View →

cs.CRcs.AIRecentMay 28, 2026

The Surface You Test Is Not the Surface That Breaks

Shifat E Arman, Syed Nazmus Sakib, Nafiul Haque, Shahrear Bin Amin

The vulnerability of LLM agents to prompt injection depends not on the specific channel (tool output vs. tool description) but on the interaction between the model and the surface.

View →

cs.CRcs.AIRecentMay 28, 2026

The Surface You Test Is Not the Surface That Breaks

Shifat E Arman, Syed Nazmus Sakib, Nafiul Haque, Shahrear Bin Amin

The vulnerability of LLM agents to prompt injection depends not on the specific channel (tool output vs. tool description) but on the interaction between the model and the surface itself.

View →

cs.CRcs.CLcs.CYRecentMay 17, 2026

AI Agents May Always Fall for Prompt Injections

Sahar Abdelnabi, Eugene Bagdasarian

The paper argues that prompt injection is a fundamental vulnerability in AI agents, proposing that Contextual Integrity (CI) offers a principled framework to understand and mitigate context-sensitive…

View →

cs.CRcs.AIcs.CLRecentMay 29, 2026

From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors

Jiejun Tan, Zhicheng Dou, Xinyu Yang, Yuyang Hu +3 more

This paper introduces ClawTrojan, a benchmark for multi-step trojan attacks against LLM agents, and proposes DASGuard, a dynamic defense mechanism that traces and sanitizes untrusted control content i…

View →

cs.CRcs.AIcs.CLRecentMay 29, 2026

From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors

Jiejun Tan, Zhicheng Dou, Xinyu Yang, Yuyang Hu +3 more

The paper introduces ClawTrojan, a benchmark for multi-step trojan attacks against LLM agents, and proposes DASGuard, a defense mechanism that detects and sanitizes backdoor content planted across mul…

View →

cs.CRcs.AIRecentMay 17, 2026

ASPI: Seeking Ambiguity Clarification Amplifies Prompt Injection Vulnerability in LLM Agents

Udari Madhushani Sehwag, Zhengyang Shan, Heming Liu, Dileepa Lakshan +2 more

The paper introduces ASPI, a benchmark showing that requiring LLM agents to seek clarification significantly amplifies their vulnerability to prompt injection attacks.

View →

cs.CRRecentApr 4, 2026

AttackEval: A Systematic Empirical Study of Prompt Injection Attack Effectiveness Against Large Language Models

Jackson Wang

AttackEval systematically evaluates the effectiveness of 250 prompt injection prompts across ten attack categories, finding that composite and obfuscation attacks are highly effective against current…

View →

cs.CRRecentApr 29, 2026

Indirect Prompt Injection in the Wild: An Empirical Study of Prevalence, Techniques, and Objectives

Soheil Khodayari, Xuenan Zhang, Bhupendra Acharya, Giancarlo Pellegrino

This paper provides a large-scale empirical analysis of indirect prompt injections found in webpages, revealing that prompt-based interference is a widespread, persistent, and growing threat targeting…

View →

cs.CRRecentMay 25, 2026

AgentSecBench: Measuring Prompt Injection, Privacy Leakage, and Tool-Use Integrity in LLM Agents

Faruk Alpay, Taylan Alpay

The paper introduces AgentSecBench, a security evaluation framework that measures prompt injection, privacy leakage, and tool-use integrity in LLM agents by defining formal security games and testing…

View →

cs.CRcs.AIRecentMar 31, 2026

Architecting Secure AI Agents: Perspectives on System-Level Defenses Against Indirect Prompt Injection Attacks

Chong Xiang, Drew Zagieboylo, Shaona Ghosh, Sanjay Kariyappa +4 more

The paper proposes a vision for system-level defenses against indirect prompt injection attacks targeting AI agents, emphasizing structured control and human oversight.

View →

cs.CRRecentApr 27, 2026

AgentVisor: Defending LLM Agents Against Prompt Injection via Semantic Virtualization

Zonghao Ying, Haozheng Wang, Jiangfan Liu, Quanchen Zou +4 more

AgentVisor is a novel defense framework that uses semantic virtualization, inspired by OS principles, to significantly reduce LLM agent vulnerability to prompt injection while maintaining high utility…

View →

cs.CRcs.CLcs.CVRecentApr 9, 2026

Are GUI Agents Focused Enough? Automated Distraction via Semantic-level UI Element Injection

Wenkui Yang, Chao Jin, Haisu Zhu, Weilin Luo +6 more

The paper introduces Semantic-level UI Element Injection, a novel red-teaming technique that overlays misleading UI elements onto screenshots to significantly improve the attack success rate against s…

View →

cs.CRcs.AIcs.CLRecentJun 3, 2026

Domain-Conditioned Safety in Frontier Computer-Using Agents: A 793-Episode Browser Benchmark, a Coding-Domain Cross-Reference, and a Reproducibility Audit of Recent Red-Teaming

Nicholas Saban

The paper benchmarks current frontier computer-using agents against hand-crafted attacks, finding that while they are highly safe in browser tasks, this safety does not generalize to other domains lik…

View →

cs.CRcs.AIRecentApr 13, 2026

ClawGuard: A Runtime Security Framework for Tool-Augmented LLM Agents Against Indirect Prompt Injection

Wei Zhao, Zhe Li, Peixin Zhang, Jun Sun

ClawGuard is a novel runtime security framework that deterministically enforces user-confirmed rules at tool-call boundaries to protect LLM agents from indirect prompt injection.

View →

cs.CRcs.LGRecentMay 23, 2026

Poisoning the Watchtower: Prompt Injection Attacks Against LLM-Augmented Security Operations Through Adversarial Log Content

Rohan Pandey, Archit Bhujang

The paper introduces 'log-substrate prompt injection,' demonstrating that attacker-controlled log fields can be used to manipulate LLM-powered security analysis, with persona hijacking and context man…

View →

cs.CRRecentApr 11, 2026

PlanGuard: Defending Agents against Indirect Prompt Injection via Planning-based Consistency Verification

Guangyu Gong, Zizhuang Deng

PlanGuard is a training-free defense framework that uses an isolated Planner and hierarchical verification to defend LLM agents against Indirect Prompt Injection by verifying the consistency of planne…

View →