~ similar to 2605.28999v1· 20 results
Mohan Zhang, Yuqi Jia, Zhen Tan, Steven Jiang +3 more
This study provides the first large-scale measurement of prompt injection attacks in real-world LLM-based resume screening, finding that approximately 1% of resumes contain hidden injections.
Jiejun Tan, Zhicheng Dou, Xinyu Yang, Yuyang Hu +3 more
This paper introduces ClawTrojan, a benchmark for multi-step trojan attacks against LLM agents, and proposes DASGuard, a dynamic defense mechanism that traces and sanitizes untrusted control content i…
Jiejun Tan, Zhicheng Dou, Xinyu Yang, Yuyang Hu +3 more
The paper introduces ClawTrojan, a benchmark for multi-step trojan attacks against LLM agents, and proposes DASGuard, a defense mechanism that detects and sanitizes backdoor content planted across mul…
The security of LLM agents is critically dependent on their system prompt configuration, which creates a brittle attack surface that can be exploited by attackers inverting the prompt's core assumptio…
This paper provides a large-scale empirical analysis of indirect prompt injections found in webpages, revealing that prompt-based interference is a widespread, persistent, and growing threat targeting…
The paper evaluates prompt-injection defenses for educational LLM tutors, demonstrating that optimal security requires balancing adversarial robustness, usability, and latency, and proposing a compreh…
The paper introduces Prompt Control-Flow Integrity (PCFI), a priority-aware runtime defense that models LLM prompts as structured segments to intercept prompt injection attacks with high accuracy and…
AttackEval systematically evaluates the effectiveness of 250 prompt injection prompts across ten attack categories, finding that composite and obfuscation attacks are highly effective against current…
Hang Li, Fedor Filippov, Yuling Lin, Pengfei He +5 more
This paper investigates the vulnerability of LLM-based automatic grading systems to prompt injection (PI) attacks, demonstrating that current systems are highly susceptible to manipulation that can le…
Priyal Deep, Shane Emmons, Amy Fox, Kyle Bacon +3 more
The paper evaluates prompt injection defenses and finds that only external output filtering, implemented in application code, reliably prevents secret leaks from LLMs, demonstrating that model-based d…
The paper proposes a multi-layered security framework to detect and mitigate SQL injection attacks that occur when Large Language Models translate natural language prompts into database queries.
This paper proposes a multi-layered defense strategy combining pre-output monitoring, calibrated canary detection, and cumulative information-flow tracking to prevent LLM agents from exfiltrating sens…
The paper proposes CASCADE, a novel three-tiered, fully local defense architecture for detecting prompt injection and tool poisoning attacks in Model Context Protocol (MCP)-based LLM systems, achievin…
The paper introduces 'log-substrate prompt injection,' demonstrating that attacker-controlled log fields can be used to manipulate LLM-powered security analysis, with persona hijacking and context man…
The paper introduces dynamic, per-request separator generation for Polymorphic Prompt Assembling (PPA), significantly reducing the blast-radius vulnerability to prompt injection attacks by ensuring un…
The paper introduces AgentSecBench, a security evaluation framework that measures prompt injection, privacy leakage, and tool-use integrity in LLM agents by defining formal security games and testing…
The paper empirically analyzes the susceptibility of seven widely used AI-assisted development tools (MCP clients) to prompt injection via tool-poisoning, revealing significant disparities in their se…
Runpeng Geng, Chenlong Yin, Yanting Wang, Ying Chen +1 more
The paper introduces PIArena, a unified and extensible platform designed to address the lack of standardized evaluation for prompt injection, revealing critical limitations in current state-of-the-art…
The paper empirically evaluates the security quality of LLM-generated code across various prompting methods, finding that while prompting alters the structure of weaknesses, it is insufficient to reli…
This paper introduces seven novel, cross-domain techniques for detecting prompt injection attacks, moving beyond the limitations of traditional regex and transformer classifiers.