ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:

~ similar to 2603.19469v1· 20 results

cs.CRcs.SERecentMay 5, 2026

ARGUS: Defending LLM Agents Against Context-Aware Prompt Injection

Shihao Weng, Yang Feng, Jinrui Zhang, Xiaofei Xie +2 more

The paper introduces ARGUS, a defense mechanism that uses provenance-aware decision auditing to protect LLM agents from sophisticated, context-aware prompt injection attacks, significantly reducing th…

View →
cs.CRRecentMay 16, 2026

Securing LLM Agents Need Intent-to-Execution Integrity

Wenjie Qu, Ming Xu, Peiran Wang, Shengfang Zhai +2 more

The paper proposes defining 'intent-to-execution integrity' as the necessary end-to-end correctness property for securing LLM agents, arguing that current defenses are insufficient due to untrusted co…

View →
cs.CRRecentMay 9, 2026

When LLMs Team Up: A Coordinated Attack Framework for Automated Cyber Intrusions

Minfeng Qi, Tianqing Zhu, Zijie Xu, Congcong Zhu +2 more

The paper introduces CAESAR, a novel multi-agent framework that coordinates LLM agents across five specialized roles to improve success rates and stability in complex, multi-stage cyber intrusion task…

View →
cs.CRcs.CLcs.CYRecentMay 17, 2026

AI Agents May Always Fall for Prompt Injections

Sahar Abdelnabi, Eugene Bagdasarian

The paper argues that prompt injection is a fundamental vulnerability in AI agents, proposing that Contextual Integrity (CI) offers a principled framework to understand and mitigate context-sensitive…

View →
cs.CRRecentMay 25, 2026

AgentSecBench: Measuring Prompt Injection, Privacy Leakage, and Tool-Use Integrity in LLM Agents

Faruk Alpay, Taylan Alpay

The paper introduces AgentSecBench, a security evaluation framework that measures prompt injection, privacy leakage, and tool-use integrity in LLM agents by defining formal security games and testing…

View →
cs.CRRecentApr 27, 2026

AgentVisor: Defending LLM Agents Against Prompt Injection via Semantic Virtualization

Zonghao Ying, Haozheng Wang, Jiangfan Liu, Quanchen Zou +4 more

AgentVisor is a novel defense framework that uses semantic virtualization, inspired by OS principles, to significantly reduce LLM agent vulnerability to prompt injection while maintaining high utility…

View →
cs.CRcs.AIcs.CLRecentMay 29, 2026

From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors

Jiejun Tan, Zhicheng Dou, Xinyu Yang, Yuyang Hu +3 more

This paper introduces ClawTrojan, a benchmark for multi-step trojan attacks against LLM agents, and proposes DASGuard, a dynamic defense mechanism that traces and sanitizes untrusted control content i…

View →
cs.CRcs.AIcs.CLRecentMay 29, 2026

From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors

Jiejun Tan, Zhicheng Dou, Xinyu Yang, Yuyang Hu +3 more

The paper introduces ClawTrojan, a benchmark for multi-step trojan attacks against LLM agents, and proposes DASGuard, a defense mechanism that detects and sanitizes backdoor content planted across mul…

View →
cs.CRRecentApr 11, 2026

PlanGuard: Defending Agents against Indirect Prompt Injection via Planning-based Consistency Verification

Guangyu Gong, Zizhuang Deng

PlanGuard is a training-free defense framework that uses an isolated Planner and hierarchical verification to defend LLM agents against Indirect Prompt Injection by verifying the consistency of planne…

View →
cs.CRcs.AIRecentMar 31, 2026

Architecting Secure AI Agents: Perspectives on System-Level Defenses Against Indirect Prompt Injection Attacks

Chong Xiang, Drew Zagieboylo, Shaona Ghosh, Sanjay Kariyappa +4 more

The paper proposes a vision for system-level defenses against indirect prompt injection attacks targeting AI agents, emphasizing structured control and human oversight.

View →
cs.CRcs.CLRecentMay 11, 2026

LITMUS: Benchmarking Behavioral Jailbreaks of LLM Agents in Real OS Environments

Chiyu Zhang, Huiqin Yang, Bendong Jiang, Xiaolei Zhang +7 more

The paper introduces LITMUS, a novel benchmark that rigorously tests LLM agents for dangerous, physical-layer behavioral jailbreaks in real OS environments, revealing that current agents frequently ex…

View →
cs.CRcs.LGRecentApr 25, 2026

A Systematic Survey of Security Threats and Defenses in LLM-Based AI Agents: A Layered Attack Surface Framework

Kexin Chu

The paper proposes the Layered Attack Surface Model (LASM), a structural taxonomy that maps security threats and defenses across the complex, multi-layered architecture of AI agents, revealing signifi…

View →
cs.CRRecentMay 23, 2026

Reframing LLM Agent Security as an Agent-Human Interaction Problem

Peiran Wang, Ying Li, Yuan Tian

The paper argues that LLM agent security is fundamentally an agent-human interaction (AHI) problem, demonstrating that industry practices rely on human-centric mechanisms while academic research focus…

View →
cs.CRcs.AIRecentMay 22, 2026

When the Manual Lies: A Realistic Benchmark to Evaluate MCP Poisoning Attacks for LLM Agents

Shi Liu, Xuehai Tang, Xikang Yang, Liang Lin +3 more

This paper introduces a new benchmark to test Tool Description Poisoning (TDP) attacks on LLM agents, demonstrating that even advanced models like GPT-4o are highly vulnerable and that current defense…

View →
cs.CRcs.AIcs.SERecentMay 21, 2026

Benchmarking Autonomous Agents against Temporal, Spatial, and Semantic Evasions

Jianan Ma, Xiaohu Du, Ruixiao Lin, Yaoxiang Bian +7 more

The paper introduces a multi-dimensional evasion framework and a new benchmark (A3S-Bench) to test autonomous agents, demonstrating that stateful, multi-turn attacks significantly increase system risk…

View →
cs.CRRecentMay 14, 2026

Toward Securing AI Agents Like Operating Systems

Lukas Pirch, Micha Horlboge, Patrick Großmann, Syeda Mahnur Asif +3 more

This paper analyzes the security of LLM-based autonomous agents by drawing parallels to operating system security, finding that while some vulnerabilities are inherent, many can be mitigated using est…

View →
cs.CRcs.AIRecentApr 13, 2026

ClawGuard: A Runtime Security Framework for Tool-Augmented LLM Agents Against Indirect Prompt Injection

Wei Zhao, Zhe Li, Peixin Zhang, Jun Sun

ClawGuard is a novel runtime security framework that deterministically enforces user-confirmed rules at tool-call boundaries to protect LLM agents from indirect prompt injection.

View →
cs.CRcs.AIRecentMar 30, 2026

Evaluating Privilege Usage of Agents with Real-World Tools

Quan Zhang, Lianhang Fu, Lvsi Lian, Gwihwan Go +4 more

The paper introduces GrantBox, a new security sandbox that evaluates how well LLM agents handle real-world tool privileges, finding that agents remain highly vulnerable to sophisticated attacks.

View →
cs.CRcs.AIRecentMay 7, 2026

PragLocker: Protecting Agent Intellectual Property in Untrusted Deployments via Non-Portable Prompts

Qinfeng Li, Yuntai Bao, Jianghui Hu, Wenqi Zhang +4 more

PragLocker is a novel prompt protection scheme that secures valuable LLM agent prompts against theft and reuse by other proprietary models by making them non-portable.

View →
cs.CRcs.AIRecentApr 7, 2026

A Formal Security Framework for MCP-Based AI Agents: Threat Taxonomy, Verification Models, and Defense Mechanisms

Nirajan Acharya, Gaurav Kumar Gupta

The paper introduces MCPSHIELD, a comprehensive formal security framework that systematically characterizes and provides a defense-in-depth architecture for the rapidly adopted but insecure Model Cont…

View →