~ similar to 2606.03354v1· 20 results
Zelin Guan, Shengda Zhuo, Zeyan Li, Jinchun He +3 more
E-MIA introduces a novel, stealthy black-box membership inference attack that converts verifiable hard evidence within a candidate document into an objective, multi-part exam score to determine if the…
This paper demonstrates that retrieval-augmented in-context learning systems for document QA are vulnerable to membership inference attacks, proposing novel black-box methods that exploit query prefix…
Yanming Mu, Hao Hu, Feiyang Li, Qiao Yuan +6 more
This paper provides the first comprehensive, end-to-end survey dedicated to the security of Retrieval-Augmented Generation (RAG) systems, systematically mapping threats, defenses, and benchmarks acros…
Yuming Xu, Mingtao Zhang, Zhuohan Ge, Haoyang Li +6 more
This paper proposes a comprehensive taxonomy (SLOT) to systematically categorize security risks, attacks, and defenses specific to Retrieval-Augmented Generation (RAG), clarifying that these risks are…
Haozhen Wang, Haoyue Liu, Jionghao Zhu, Zhichao Wang +2 more
The paper introduces PIDP-Attack, a novel compound adversarial attack that combines prompt injection with database poisoning to manipulate Retrieval-Augmented Generation (RAG) systems against arbitrar…
Nguyen Linh Bao Nguyen, Wanlun Ma, Viet Vo, Alsharif Abuadbba +3 more
The paper introduces MEntA, a highly query-efficient and surrogate-free membership inference attack that uses natural-language entailment to detect if a specific document was used by a RAG system, ach…
The paper proposes an unsupervised method using multiple statistical indicators to detect adversarial or compromised context documents in Retrieval Augmented Generation (RAG) systems, even without kno…
The paper introduces ImageProtector, a user-side method that embeds an imperceptible perturbation into images to prevent Multi-modal Large Language Models (MLLMs) from analyzing and extracting sensiti…
Chengcai Gao, Zhihong Sun, Xiaochuan Shi, Qiufeng Wang +1 more
The paper proposes BiRD, a bidirectional ranking defense mechanism that enhances the robustness of Retrieval-Augmented Generation (RAG) against adversarial attacks by analyzing the alignment between f…
SilentRetrieval introduces a sophisticated, two-stage data poisoning attack that successfully hijacks Retrieval-Augmented Generation (RAG) systems by injecting adversarially crafted, yet highly fluent…
The paper introduces WaveGuard, a frequency-aware, single-pass defense framework that safeguards text-to-image models by injecting structured, imperceptible perturbations into generated images, thereb…
Peiru Yang, Haoran Zheng, Tong Ju, Shiting Wang +5 more
The paper proposes M extsuperscript{3}Att, a knowledge-poisoning framework that injects covert misinformation into medical multimodal RAG systems using paired visual data triggers, demonstrating attac…
This paper presents a black-box membership inference attack (MIA) against Video Large Language Models (VideoLLMs), demonstrating that they are vulnerable by analyzing generation behavior across varyin…
Zeyuan Chen, Yihan Ma, Xinyue Shen, Michael Backes +1 more
The PopQuiz Attack is a novel black-box membership inference attack that successfully tests whether large language models memorize specific training data by framing the target data as multiple-choice…
Yiyang Zhang, Chaojian Yu, Ziming Hong, Yuanjie Shao +3 more
The paper proposes a novel Text-Guided Backdoor (TGB) attack that uses common words in text descriptions as stealthy triggers for multimodal models, enhancing practicality and controllability.
RAGShield introduces a novel, pattern-based defense system that accurately detects subtle numerical claim manipulation in government RAG systems, overcoming the inherent blind spot of embedding-based…
The paper introduces a novel, transferable learned attack (LT-MIA) that detects a universal 'signature of memorization' in language models, achieving high accuracy across diverse model architectures (…
Alexander Nemecek, Osama Zafar, Yuqiao Xu, Wenbiao Li +1 more
The paper argues that current AI content watermarking benchmarks fail to test for bias across different languages, cultures, and demographics, proposing a new set of evaluation standards to ensure fai…
This paper introduces a novel attack, RA-ICA, that targets RAG-enhanced LLMs by poisoning external knowledge bases to drastically increase inference costs, achieving up to a 13.12x increase in token c…
Paul Jünger, Justin Lovelace, Linxi Zhao, Dongyoung Go +1 more
The paper introduces SARDI, a novel, training-free framework that uses low-confidence 'lookahead' tokens generated during the denoising process of discrete diffusion language models to dynamically gui…