~ similar to 2606.05787v1· 20 results
RAGShield introduces a novel, pattern-based defense system that accurately detects subtle numerical claim manipulation in government RAG systems, overcoming the inherent blind spot of embedding-based…
Yuming Xu, Mingtao Zhang, Zhuohan Ge, Haoyang Li +6 more
This paper proposes a comprehensive taxonomy (SLOT) to systematically categorize security risks, attacks, and defenses specific to Retrieval-Augmented Generation (RAG), clarifying that these risks are…
The paper evaluates four RAG architectures under knowledge base poisoning, demonstrating that advanced architectures significantly improve robustness against adversarial contradictions, localizing the…
The paper proposes the Sentinel-Strategist architecture, an adaptive defense mechanism that selectively deploys security measures in Retrieval-Augmented Generation (RAG) systems to significantly reduc…
Yanming Mu, Hao Hu, Feiyang Li, Qiao Yuan +6 more
This paper provides the first comprehensive, end-to-end survey dedicated to the security of Retrieval-Augmented Generation (RAG) systems, systematically mapping threats, defenses, and benchmarks acros…
The paper demonstrates a class of steganographic exfiltration attacks against vector databases by hiding data within embeddings, and proposes VectorPin, a cryptographic provenance protocol to detect s…
Nguyen Linh Bao Nguyen, Wanlun Ma, Viet Vo, Alsharif Abuadbba +3 more
The paper introduces MEntA, a highly query-efficient and surrogate-free membership inference attack that uses natural-language entailment to detect if a specific document was used by a RAG system, ach…
Yuanbo Xie, Yingjie Zhang, Yulin Li, Shouyou Song +4 more
The paper introduces CanaryRAG, a novel dual-path runtime defense mechanism that detects RAG Knowledge Base Leakage attacks by embedding canary tokens into retrieved knowledge chunks.
Weifei Jin, Xilong Wang, Wei Zou, Jinyuan Jia +1 more
CleanBase is a method that detects malicious documents in RAG knowledge databases by identifying clusters (cliques) of documents that exhibit unusually high semantic similarity.
Peiru Yang, Haoran Zheng, Tong Ju, Shiting Wang +5 more
The paper proposes M extsuperscript{3}Att, a knowledge-poisoning framework that injects covert misinformation into medical multimodal RAG systems using paired visual data triggers, demonstrating attac…
Zhihao Wu, Gracia Gong, Qinglin Zhu, Yudong Chen +1 more
The paper demonstrates that combining outputs from multiple large language models (LLMs) effectively cancels out statistical watermarks, revealing a fundamental vulnerability in current AI text detect…
Zhe Yu, Wenpeng Xing, Gaolei Li, Shuguang Xiong +3 more
The paper introduces CORDON-MAS, a compartmentalized framework that defends Retrieval-Augmented Generation (RAG) against knowledge poisoning by enforcing strict information-flow control, significantly…
Tom Sander, Hongyan Chang, Tomáš Souček, Tuan Tran +9 more
TextSeal is a novel, non-overhead, and robust watermark for LLMs that enables accurate provenance tracking and detection of unauthorized use even after model distillation.
Alexander Nemecek, Osama Zafar, Yuqiao Xu, Wenbiao Li +1 more
The paper argues that current AI content watermarking benchmarks fail to test for bias across different languages, cultures, and demographics, proposing a new set of evaluation standards to ensure fai…
Chenxin Mao, Shangyu Liu, Zhenzhe Zheng, Fan Wu +2 more
The paper introduces FedRAG, a novel federated RAG framework that enables privacy-preserving cross-institutional knowledge collaboration by decoupling the self-attention mechanism from data localizati…
Kieu Dang, Phung Lai, NhatHai Phan, Yelong Shen +1 more
The paper proposes SAFESEAL, a novel key-conditioned watermarking framework that embeds robust, provider-specific watermarks into LLM outputs with minimal semantic distortion, effectively protecting i…
Zelin Guan, Shengda Zhuo, Zeyan Li, Jinchun He +3 more
E-MIA introduces a novel, stealthy black-box membership inference attack that converts verifiable hard evidence within a candidate document into an objective, multi-part exam score to determine if the…
Ziyuan Chen, Yueming Lyu, Yi Liu, Weixiang Han +3 more
The paper proposes RADAR, a novel graph-based framework that dynamically defends Retrieval-Augmented Generation (RAG) systems against evolving adversarial attacks while minimizing storage overhead.
Shuhao Zhang, Yuli Chen, Jiale Han, Bo Cheng +1 more
The paper proposes Adaptive Stealing (AS), a novel and more robust watermark stealing algorithm that dynamically selects optimal attack perspectives to significantly increase the efficiency of comprom…
The paper introduces WaveGuard, a frequency-aware, single-pass defense framework that safeguards text-to-image models by injecting structured, imperceptible perturbations into generated images, thereb…