The paper introduces MIRAGE, a pipeline that generates context-aware prompt-injection attacks by embedding malicious text into user-generated content regions of mobile screenshots, and shows that current VLM-driven GUI agents are vulnerable to them.
Mobile graphical user interface (GUI) agents driven by vision-language models (VLMs) perceive the screen as rendered pixels and choose actions from what they see, so they cannot reliably separate trusted interface elements from user-generated content. We present MIRAGE (Mobile Injection of Realistic Adversarial GUI Examples), a pipeline that turns benign mobile screenshots into prompt-injection samples by placing attacker-controlled text into ordinary user-generated content regions, without modifying the agent, the application, or the operating system. MIRAGE operates in three stages: a Localizer identifies user-controllable regions on the screenshot, a Generator synthesises context-aware payloads and renders them in the application's native style, and a Curator moderates realism and balances the samples across applications, region types, and attack intents. A key challenge is that an injected screenshot must stay visually indistinguishable from genuine user content while still diverting the agent; we address this by separating the stages that control reach, realism, and distributional balance. On a 1,111-sample benchmark spanning ten applications and eleven attack intents, all five evaluated VLM agents are vulnerable, with attack success rates of 23%-30%, and MIRAGE scores higher on human realism ratings than the strongest prior attack (3.02 versus 2.52 out of 5). We further find that per-sample realism and attack success are uncorrelated, so visual-quality filtering alone cannot reliably defend against this threat.
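The three-stage structure described in the abstract can be pictured with a minimal sketch. It is not the authors' implementation: every name here (`Region`, `Sample`, `localize`, `generate`, `curate`, `run_pipeline`), the dictionary layout of a screenshot, the constant realism score, and the per-bucket cap are assumptions made for illustration. The Generator stand-in emits only a placeholder string, where a real system would call a model and render pixels.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    app: str
    kind: str    # hypothetical region type, e.g. "comment" or "message"
    bbox: tuple  # (x, y, width, height) in screenshot pixels


@dataclass(frozen=True)
class Sample:
    region: Region
    intent: str
    payload: str
    realism: float  # hypothetical moderation score in [0, 1]


def localize(screenshot: dict) -> list:
    """Localizer stand-in: keep only regions flagged as user-controllable."""
    return [
        Region(screenshot["app"], r["kind"], tuple(r["bbox"]))
        for r in screenshot["regions"]
        if r["user_controllable"]
    ]


def generate(region: Region, intent: str) -> Sample:
    """Generator stand-in: returns a placeholder payload, no model call or rendering."""
    payload = f"<placeholder text for intent '{intent}' in a {region.kind} of {region.app}>"
    return Sample(region, intent, payload, realism=0.8)  # assumed constant score


def curate(samples: list, min_realism: float, per_bucket: int) -> list:
    """Curator stand-in: drop low-realism samples, then cap each
    (application, region type, intent) bucket at per_bucket samples."""
    buckets = defaultdict(list)
    for s in samples:
        if s.realism < min_realism:
            continue
        key = (s.region.app, s.region.kind, s.intent)
        if len(buckets[key]) < per_bucket:
            buckets[key].append(s)
    return [s for bucket in buckets.values() for s in bucket]


def run_pipeline(screenshots: list, intents: list,
                 min_realism: float = 0.5, per_bucket: int = 1) -> list:
    """Chain the three stages: localize, generate, curate."""
    candidates = [
        generate(region, intent)
        for shot in screenshots
        for region in localize(shot)
        for intent in intents
    ]
    return curate(candidates, min_realism, per_bucket)
```

Keeping the stages as separate functions mirrors the separation the abstract draws between reach (which regions are eligible), realism (how the payload is produced and scored), and distributional balance (how many samples each bucket contributes).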
AgentRAE: Remote Action Execution through Notification-based Visual Backdoors against Screenshots-ba…
The paper introduces AgentRAE, a novel backdoor attack that successfully forces…
Are GUI Agents Focused Enough? Automated Distraction via Semantic-level UI Element Injection
The paper introduces Semantic-level UI Element Injection, a novel red-teaming te…
CAPTCHA Solving for Native GUI Agents: Automated Reasoning-Action Data Generation and Self-Correctiv…
The paper introduces ReCAP, a native GUI agent that significantly improves CAPTC…
WebAgentGuard: A Reasoning-Driven Guard Model for Detecting Prompt Injection Attacks in Web Agents
The paper introduces WebAgentGuard, a novel reasoning-driven, multimodal guard m…
Mobile GUI Agent Privacy Personalization with Trajectory Induced Preference Optimization
The paper proposes Trajectory Induced Preference Optimization (TIPO) to improve…
Prompt Control-Flow Integrity: A Priority-Aware Runtime Defense Against Prompt Injection in LLM Syst…
The paper introduces Prompt Control-Flow Integrity (PCFI), a priority-aware runt…
Architecting Secure AI Agents: Perspectives on System-Level Defenses Against Indirect Prompt Injecti…
The paper proposes a vision for system-level defenses against indirect prompt in…
ClawGuard: A Runtime Security Framework for Tool-Augmented LLM Agents Against Indirect Prompt Inject…
ClawGuard is a novel runtime security framework that deterministically enforces…