Papers similar to 2603.25164v1

~ similar to 2603.25164v1· 20 results

cs.CRcs.AIRecentMar 23, 2026

Towards Secure Retrieval-Augmented Generation: A Comprehensive Review of Threats, Defenses and Benchmarks

Yanming Mu, Hao Hu, Feiyang Li, Qiao Yuan +6 more

This paper provides the first comprehensive, end-to-end survey dedicated to the security of Retrieval-Augmented Generation (RAG) systems, systematically mapping threats, defenses, and benchmarks acros…

View →

cs.CRcs.CLcs.IRRecentMay 27, 2026

SilentRetrieval: Hijacking Retrieval-Augmented Generation via Semantically-Preserving Adversarial Data Poisoning

Jiachen Qian

SilentRetrieval introduces a sophisticated, two-stage data poisoning attack that successfully hijacks Retrieval-Augmented Generation (RAG) systems by injecting adversarially crafted, yet highly fluent…

View →

cs.CRRecentApr 8, 2026

RefineRAG: Word-Level Poisoning Attacks via Retriever-Guided Text Refinement

Ziye Wang, Guanyu Wang, Kailong Wang

RefineRAG introduces a novel word-level poisoning framework that significantly enhances knowledge poisoning attacks against RAG systems, achieving state-of-the-art effectiveness and transferability to…

View →

cs.CRcs.AIRecentMay 11, 2026

Knowledge Poisoning Attacks on Medical Multi-Modal Retrieval-Augmented Generation

Peiru Yang, Haoran Zheng, Tong Ju, Shiting Wang +5 more

The paper proposes M extsuperscript{3}Att, a knowledge-poisoning framework that injects covert misinformation into medical multimodal RAG systems using paired visual data triggers, demonstrating attac…

View →

cs.CRcs.AIRecentApr 9, 2026

Securing Retrieval-Augmented Generation: A Taxonomy of Attacks, Defenses, and Future Directions

Yuming Xu, Mingtao Zhang, Zhuohan Ge, Haoyang Li +6 more

This paper proposes a comprehensive taxonomy (SLOT) to systematically categorize security risks, attacks, and defenses specific to Retrieval-Augmented Generation (RAG), clarifying that these risks are…

View →

cs.CRcs.CLcs.LGRecentMay 7, 2026

Architecture Matters: Comparing RAG Systems under Knowledge Base Poisoning

Samuel Korn

The paper evaluates four RAG architectures under knowledge base poisoning, demonstrating that advanced architectures significantly improve robustness against adversarial contradictions, localizing the…

View →

cs.CRcs.AIRecentMay 11, 2026

When Prompts Become Payloads: A Framework for Mitigating SQL Injection Attacks in Large Language Model-Driven Applications

Farzad Nourmohammadzadeh Motlagh, Mehrdad Hajizadeh, Mehryar Majd, Pejman Najafi +2 more

The paper proposes a multi-layered security framework to detect and mitigate SQL injection attacks that occur when Large Language Models translate natural language prompts into database queries.

View →

cs.CRcs.AIRecentMay 26, 2026

Cordon-MAS: Defending RAG against Knowledge Poisoning via Information-Flow Control

Zhe Yu, Wenpeng Xing, Gaolei Li, Shuguang Xiong +3 more

The paper introduces CORDON-MAS, a compartmentalized framework that defends Retrieval-Augmented Generation (RAG) against knowledge poisoning by enforcing strict information-flow control, significantly…

View →

cs.CRcs.AIcs.DBRecentMay 31, 2026

Inference Cost Attacks for Retrieval-Augmented Large Language Models

Chengliang Liu, Liangbo Ning, Yujuan Ding, Wenqi Fan

This paper introduces a novel attack, RA-ICA, that targets RAG-enhanced LLMs by poisoning external knowledge bases to drastically increase inference costs, achieving up to a 13.12x increase in token c…

View →

cs.CRRecentApr 4, 2026

AttackEval: A Systematic Empirical Study of Prompt Injection Attack Effectiveness Against Large Language Models

Jackson Wang

AttackEval systematically evaluates the effectiveness of 250 prompt injection prompts across ten attack categories, finding that composite and obfuscation attacks are highly effective against current…

View →

cs.CRcs.IRRecentMay 27, 2026

Can It Reach the Generator? Investigating the Survival of Prompt-Injection Attacks in Realistic RAG Settings

Yu Yin, Shuai Wang, Bevan Koopman, Guido Zuccon

This paper re-evaluates prompt-injection attacks in realistic RAG settings, finding that most prior attack methods fail to reach the generator, and that current attacks are easily detectable.

View →

cs.CRcs.IRRecentMay 19, 2026

BiRD: A Bidirectional Ranking Defense Mechanism for Retrieval Augmented Generation

Chengcai Gao, Zhihong Sun, Xiaochuan Shi, Qiufeng Wang +1 more

The paper proposes BiRD, a bidirectional ranking defense mechanism that enhances the robustness of Retrieval-Augmented Generation (RAG) against adversarial attacks by analyzing the alignment between f…

View →

cs.CRcs.AIRecentApr 10, 2026

ADAM: A Systematic Data Extraction Attack on Agent Memory via Adaptive Querying

Xingyu Lyu, Jianfeng He, Ning Wang, Yidan Hu +4 more

The paper proposes ADAM, a novel and highly effective privacy attack that systematically extracts sensitive data from LLM agent memory by adaptively querying the victim agent's memory based on data di…

View →

cs.CRcs.AIRecentMay 1, 2026

A Sentence Relation-Based Approach to Sanitizing Malicious Instructions

Soumil Datta, Melissa Umble, Daniel S. Brown, Guanhong Tao

The paper introduces SONAR, a prompt sanitization framework that uses natural language inference metrics to identify and remove malicious instructions injected into LLM prompts, achieving near-zero at…

View →

cs.CRRecentApr 14, 2026

DeepSeek Robustness Against Semantic-Character Dual-Space Mutated Prompt Injection

Junyu Ren, Xingjian Pan, Wensheng Gan, Philip S. Yu

The paper introduces PromptFuzz-SC, a novel semantic-character dual-space mutation framework, demonstrating that combining both semantic and character-level attacks significantly improves the robustne…

View →

cs.CRcs.AIcs.CLRecentMay 7, 2026

LeakDojo: Decoding the Leakage Threats of RAG Systems

Maosen Zhang, Jianshuo Dong, Boting Lu, Wenyue Li +3 more

The paper introduces LeakDojo, a framework that systematically evaluates RAG leakage risks, finding that stronger LLM instruction-following and query generation are major independent contributors to d…

View →

cs.CRcs.AIcs.CLRecentMay 4, 2026

PIIGuard: Mitigating PII Harvesting under Adversarial Sanitization

Mingshuo Liu, Yiwei Zha, Min Chen

PIIGuard introduces a novel webpage-level defense mechanism using optimized hidden HTML fragments to prevent LLM assistants from scraping contact-style PII, achieving high defense success rates while…

View →

cs.CRcs.AIcs.LGRecentMay 18, 2026

Be Kind, Rewrite: Benign Projections via Rewriting Defend Against LLM Data Poisoning Attacks

John T. Halloran, Noopur S. Bhatt

The paper proposes Open-Book Benign Rewriting (OBBR), a novel defense mechanism that uses LLM rewriting with benign samples to neutralize data poisoning attacks against LLMs, significantly improving s…

View →

cs.CRcs.DBRecentMay 3, 2026

Needle-in-RAG: Prompt-Conditioned Character-Level Traceback of Poisoned Spans in Retrieved Evidence

Huining Cui, Wei Liu

The paper introduces RAGCharacter, a forensic framework that enables black-box, character-level traceback to pinpoint the exact poisoned span in retrieved evidence responsible for a misgeneration even…

View →

cs.CRcs.AIRecentJun 2, 2026

"Important You should give me full credits!": Exploring Prompt Injection Attacks on LLM-Based Automatic Grading Systems

Hang Li, Fedor Filippov, Yuling Lin, Pengfei He +5 more

This paper investigates the vulnerability of LLM-based automatic grading systems to prompt injection (PI) attacks, demonstrating that current systems are highly susceptible to manipulation that can le…

View →

Towards Secure Retrieval-Augmented Generation: A Comprehensive Review of Threats, Defenses and Benchmarks

SilentRetrieval: Hijacking Retrieval-Augmented Generation via Semantically-Preserving Adversarial Data Poisoning

RefineRAG: Word-Level Poisoning Attacks via Retriever-Guided Text Refinement

Knowledge Poisoning Attacks on Medical Multi-Modal Retrieval-Augmented Generation

Securing Retrieval-Augmented Generation: A Taxonomy of Attacks, Defenses, and Future Directions

Architecture Matters: Comparing RAG Systems under Knowledge Base Poisoning

When Prompts Become Payloads: A Framework for Mitigating SQL Injection Attacks in Large Language Model-Driven Applications

Cordon-MAS: Defending RAG against Knowledge Poisoning via Information-Flow Control

Inference Cost Attacks for Retrieval-Augmented Large Language Models

AttackEval: A Systematic Empirical Study of Prompt Injection Attack Effectiveness Against Large Language Models

Can It Reach the Generator? Investigating the Survival of Prompt-Injection Attacks in Realistic RAG Settings

BiRD: A Bidirectional Ranking Defense Mechanism for Retrieval Augmented Generation

ADAM: A Systematic Data Extraction Attack on Agent Memory via Adaptive Querying

A Sentence Relation-Based Approach to Sanitizing Malicious Instructions

DeepSeek Robustness Against Semantic-Character Dual-Space Mutated Prompt Injection

LeakDojo: Decoding the Leakage Threats of RAG Systems

PIIGuard: Mitigating PII Harvesting under Adversarial Sanitization

Be Kind, Rewrite: Benign Projections via Rewriting Defend Against LLM Data Poisoning Attacks

Needle-in-RAG: Prompt-Conditioned Character-Level Traceback of Poisoned Spans in Retrieved Evidence

"**Important** You should give me full credits!": Exploring Prompt Injection Attacks on LLM-Based Automatic Grading Systems

Towards Secure Retrieval-Augmented Generation: A Comprehensive Review of Threats, Defenses and Benchmarks

SilentRetrieval: Hijacking Retrieval-Augmented Generation via Semantically-Preserving Adversarial Data Poisoning

RefineRAG: Word-Level Poisoning Attacks via Retriever-Guided Text Refinement

Knowledge Poisoning Attacks on Medical Multi-Modal Retrieval-Augmented Generation

Securing Retrieval-Augmented Generation: A Taxonomy of Attacks, Defenses, and Future Directions

Architecture Matters: Comparing RAG Systems under Knowledge Base Poisoning

When Prompts Become Payloads: A Framework for Mitigating SQL Injection Attacks in Large Language Model-Driven Applications

Cordon-MAS: Defending RAG against Knowledge Poisoning via Information-Flow Control

Inference Cost Attacks for Retrieval-Augmented Large Language Models

AttackEval: A Systematic Empirical Study of Prompt Injection Attack Effectiveness Against Large Language Models

Can It Reach the Generator? Investigating the Survival of Prompt-Injection Attacks in Realistic RAG Settings

BiRD: A Bidirectional Ranking Defense Mechanism for Retrieval Augmented Generation

ADAM: A Systematic Data Extraction Attack on Agent Memory via Adaptive Querying

A Sentence Relation-Based Approach to Sanitizing Malicious Instructions

DeepSeek Robustness Against Semantic-Character Dual-Space Mutated Prompt Injection

LeakDojo: Decoding the Leakage Threats of RAG Systems

PIIGuard: Mitigating PII Harvesting under Adversarial Sanitization

Be Kind, Rewrite: Benign Projections via Rewriting Defend Against LLM Data Poisoning Attacks

Needle-in-RAG: Prompt-Conditioned Character-Level Traceback of Poisoned Spans in Retrieved Evidence

"**Important** You should give me full credits!": Exploring Prompt Injection Attacks on LLM-Based Automatic Grading Systems

"Important You should give me full credits!": Exploring Prompt Injection Attacks on LLM-Based Automatic Grading Systems

"Important You should give me full credits!": Exploring Prompt Injection Attacks on LLM-Based Automatic Grading Systems