~ similar to 2605.18414v1· 20 results
The paper introduces AgentSecBench, a security evaluation framework that measures prompt injection, privacy leakage, and tool-use integrity in LLM agents by defining formal security games and testing…
Jiaqi Luo, Songyang Peng, Jiarun Dai, Zhile Chen +5 more
AgentGuard is an attribute-based access control framework designed to mitigate severe security risks, such as privacy leakage and system compromise, in tool-using LLM-based agents.
ClawGuard is a novel runtime security framework that deterministically enforces user-confirmed rules at tool-call boundaries to protect LLM agents from indirect prompt injection.
Qinfeng Li, Yuntai Bao, Jianghui Hu, Wenqi Zhang +4 more
PragLocker is a novel prompt protection scheme that secures valuable LLM agent prompts against theft and reuse by other proprietary models by making them non-portable.
The paper introduces the Open Agent Passport (OAP), a deterministic pre-action authorization framework that intercepts and validates AI agent tool calls against a declarative policy, achieving a 0% su…
Pengyu Sun, Qishu Jin, Enhao Huang, Zifeng Kang +3 more
VIPER-MCP is a novel, end-to-end automated framework that detects and dynamically confirms the exploitability of taint-style vulnerabilities in Model Context Protocol (MCP) servers, achieving high-fid…
Jiejun Tan, Zhicheng Dou, Xinyu Yang, Yuyang Hu +3 more
This paper introduces ClawTrojan, a benchmark for multi-step trojan attacks against LLM agents, and proposes DASGuard, a dynamic defense mechanism that traces and sanitizes untrusted control content i…
Jiejun Tan, Zhicheng Dou, Xinyu Yang, Yuyang Hu +3 more
The paper introduces ClawTrojan, a benchmark for multi-step trojan attacks against LLM agents, and proposes DASGuard, a defense mechanism that detects and sanitizes backdoor content planted across mul…
Zihan Wang, Rui Zhang, Yu Liu, Chi Liu +3 more
This paper presents the first systematic study of black-box skill stealing attacks against proprietary LLM agents, demonstrating that structured agent skills can be easily extracted, posing a signific…
Ying Li, Yanju Chen, Peiran Wang, Issac Khabra +3 more
The paper introduces Conleash, a client-side middleware that uses a risk lattice to enforce granular, boundary-scoped authorization for tool invocations, significantly improving user consent and secur…
Shihao Weng, Yang Feng, Jinrui Zhang, Xiaofei Xie +2 more
The paper introduces ARGUS, a defense mechanism that uses provenance-aware decision auditing to protect LLM agents from sophisticated, context-aware prompt injection attacks, significantly reducing th…
The paper empirically analyzes the susceptibility of seven widely used AI-assisted development tools (MCP clients) to prompt injection via tool-poisoning, revealing significant disparities in their se…
Zonghao Ying, Haozheng Wang, Jiangfan Liu, Quanchen Zou +4 more
AgentVisor is a novel defense framework that uses semantic virtualization, inspired by OS principles, to significantly reduce LLM agent vulnerability to prompt injection while maintaining high utility…
SecureMCP proposes a novel, policy-enforced framework that integrates Role-Based Access Control (RBAC) with an MCP server to provide multi-layer, fine-grained defense against malicious LLM-generated S…
Xiaochong Jiang, Shiqi Yang, Ziwei Li, Lifei Liu +2 more
ChainCaps introduces a novel runtime capability budgeting system that prevents 'permission laundering' in complex tool-using agents, significantly reducing attack success rates while maintaining benig…
This paper analyzes the security vulnerabilities of the Model Context Protocol (MCP), identifying tool poisoning as the most critical client-side threat, and proposes a multi-layered defense strategy.
The paper systematically maps LLM agent vulnerabilities by testing 10,000 prompt variations, finding that 'goal reframing' language is the primary trigger for exploitation, rather than broad adversari…
The paper introduces Policy-First Tooling, a model-agnostic permission layer that significantly enhances the safety and reliability of tool-orchestrated AI workflows by enforcing explicit constraints…
The paper introduces TRUSTDESC, a novel framework that prevents tool poisoning attacks in LLM applications by automatically generating highly accurate and trusted tool descriptions directly from the t…
The security of LLM agents is critically dependent on their system prompt configuration, which creates a brittle attack surface that can be exploited by attackers inverting the prompt's core assumptio…