Papers similar to 2606.05787v1

~ similar to 2606.05787v1· 20 results

cs.CRcs.AIRecentApr 1, 2026

RAGShield: Detecting Numerical Claim Manipulation in Government RAG Systems

RAGShield introduces a novel, pattern-based defense system that accurately detects subtle numerical claim manipulation in government RAG systems, overcoming the inherent blind spot of embedding-based…

View →

cs.CRcs.AIRecentApr 9, 2026

Securing Retrieval-Augmented Generation: A Taxonomy of Attacks, Defenses, and Future Directions

Yuming Xu, Mingtao Zhang, Zhuohan Ge, Haoyang Li +6 more

This paper proposes a comprehensive taxonomy (SLOT) to systematically categorize security risks, attacks, and defenses specific to Retrieval-Augmented Generation (RAG), clarifying that these risks are…

View →

cs.CRcs.CLcs.LGRecentMay 7, 2026

Architecture Matters: Comparing RAG Systems under Knowledge Base Poisoning

Samuel Korn

The paper evaluates four RAG architectures under knowledge base poisoning, demonstrating that advanced architectures significantly improve robustness against adversarial contradictions, localizing the…

View →

cs.CRcs.AIRecentApr 22, 2026

Adaptive Defense Orchestration for RAG: A Sentinel-Strategist Architecture against Multi-Vector Attacks

Pranav Pallerla, Wilson Naik Bhukya, Bharath Vemula, Charan Ramtej Kodi

The paper proposes the Sentinel-Strategist architecture, an adaptive defense mechanism that selectively deploys security measures in Retrieval-Augmented Generation (RAG) systems to significantly reduc…

View →

cs.CRcs.AIRecentMar 23, 2026

Towards Secure Retrieval-Augmented Generation: A Comprehensive Review of Threats, Defenses and Benchmarks

Yanming Mu, Hao Hu, Feiyang Li, Qiao Yuan +6 more

This paper provides the first comprehensive, end-to-end survey dedicated to the security of Retrieval-Augmented Generation (RAG) systems, systematically mapping threats, defenses, and benchmarks acros…

View →

cs.CRcs.IRcs.LGRecentMay 13, 2026

VectorSmuggle: Steganographic Exfiltration in Embedding Stores and a Cryptographic Provenance Defense

Jascha Wanger

The paper demonstrates a class of steganographic exfiltration attacks against vector databases by hiding data within embeddings, and proposes VectorPin, a cryptographic provenance protocol to detect s…

View →

cs.CRRecentMay 23, 2026

Five Queries Are Enough: Query-Efficient and Surrogate-Free Membership Inference Attacks on RAG via Entailment

Nguyen Linh Bao Nguyen, Wanlun Ma, Viet Vo, Alsharif Abuadbba +3 more

The paper introduces MEntA, a highly query-efficient and surrogate-free membership inference attack that uses natural-language entailment to detect if a specific document was used by a RAG system, ach…

View →

cs.CRcs.AIcs.CLRecentApr 12, 2026

Detecting RAG Extraction Attack via Dual-Path Runtime Integrity Game

Yuanbo Xie, Yingjie Zhang, Yulin Li, Shouyou Song +4 more

The paper introduces CanaryRAG, a novel dual-path runtime defense mechanism that detects RAG Knowledge Base Leakage attacks by embedding canary tokens into retrieved knowledge chunks.

View →

cs.CRcs.LGRecentMay 1, 2026

CleanBase: Detecting Malicious Documents in RAG Knowledge Databases

Weifei Jin, Xilong Wang, Wei Zou, Jinyuan Jia +1 more

CleanBase is a method that detects malicious documents in RAG knowledge databases by identifying clusters (cliques) of documents that exhibit unusually high semantic similarity.

View →

cs.CRcs.AIRecentMay 11, 2026

Knowledge Poisoning Attacks on Medical Multi-Modal Retrieval-Augmented Generation

Peiru Yang, Haoran Zheng, Tong Ju, Shiting Wang +5 more

The paper proposes M extsuperscript{3}Att, a knowledge-poisoning framework that injects covert misinformation into medical multimodal RAG systems using paired visual data triggers, demonstrating attac…

View →

cs.CLRecentMay 28, 2026

Linear Ensembles Wash Away Watermarks: On the Fragility of Distributional Perturbations in LLMs

Zhihao Wu, Gracia Gong, Qinglin Zhu, Yudong Chen +1 more

The paper demonstrates that combining outputs from multiple large language models (LLMs) effectively cancels out statistical watermarks, revealing a fundamental vulnerability in current AI text detect…

View →

cs.CRcs.AIRecentMay 26, 2026

Cordon-MAS: Defending RAG against Knowledge Poisoning via Information-Flow Control

Zhe Yu, Wenpeng Xing, Gaolei Li, Shuguang Xiong +3 more

The paper introduces CORDON-MAS, a compartmentalized framework that defends Retrieval-Augmented Generation (RAG) against knowledge poisoning by enforcing strict information-flow control, significantly…

View →

cs.CRcs.CLcs.LGRecentMay 12, 2026

TextSeal: A Localized LLM Watermark for Provenance & Distillation Protection

Tom Sander, Hongyan Chang, Tomáš Souček, Tuan Tran +9 more

TextSeal is a novel, non-overhead, and robust watermark for LLMs that enables accurate provenance tracking and detection of unauthorized use even after model distillation.

View →

cs.CYcs.CLcs.CRRecentApr 15, 2026

Who Gets Flagged? The Pluralistic Evaluation Gap in AI Content Watermarking

Alexander Nemecek, Osama Zafar, Yuqiao Xu, Wenbiao Li +1 more

The paper argues that current AI content watermarking benchmarks fail to test for bias across different languages, cultures, and demographics, proposing a new set of evaluation standards to ensure fai…

View →

cs.CRcs.DCRecentMay 25, 2026

An Efficient and Privacy-Preserving Architecture for Cross-Institutional Collaborative RAG

Chenxin Mao, Shangyu Liu, Zhenzhe Zheng, Fan Wu +2 more

The paper introduces FedRAG, a novel federated RAG framework that enables privacy-preserving cross-institutional knowledge collaboration by decoupling the self-attention mechanism from data localizati…

View →

cs.CRcs.CLRecentMay 22, 2026

Robust LLM Watermarking with Minimal Semantic Distortion for IP Protection

Kieu Dang, Phung Lai, NhatHai Phan, Yelong Shen +1 more

The paper proposes SAFESEAL, a novel key-conditioned watermarking framework that embeds robust, provider-specific watermarks into LLM outputs with minimal semantic distortion, effectively protecting i…

View →

cs.CRcs.AIRecentMay 1, 2026

E-MIA: Exam-Style Black-Box Membership Inference Attacks against RAG Systems

Zelin Guan, Shengda Zhuo, Zeyan Li, Jinchun He +3 more

E-MIA introduces a novel, stealthy black-box membership inference attack that converts verifiable hard evidence within a candidate document into an objective, multi-part exam score to determine if the…

View →

cs.CRcs.LGRecentMay 21, 2026

RADAR: Defending RAG Dynamically against Retrieval Corruption

Ziyuan Chen, Yueming Lyu, Yi Liu, Weixiang Han +3 more

The paper proposes RADAR, a novel graph-based framework that dynamically defends Retrieval-Augmented Generation (RAG) systems against evolving adversarial attacks while minimizing storage overhead.

View →

cs.CRcs.AIRecentApr 13, 2026

Beyond A Fixed Seal: Adaptive Stealing Watermark in Large Language Models

Shuhao Zhang, Yuli Chen, Jiale Han, Bo Cheng +1 more

The paper proposes Adaptive Stealing (AS), a novel and more robust watermark stealing algorithm that dynamically selects optimal attack perspectives to significantly increase the efficiency of comprom…

View →

cs.CRcs.AIRecentMay 21, 2026

Safeguarding Text-to-Image Generative Models Against Unauthorized Knowledge Distillation

Yilan Gao, Sida Huang, Hongyuan Zhang, Xuelong Li

The paper introduces WaveGuard, a frequency-aware, single-pass defense framework that safeguards text-to-image models by injecting structured, imperceptible perturbations into generated images, thereb…

View →