~ similar to 2605.13764v1· 20 results
Yanming Mu, Hao Hu, Feiyang Li, Qiao Yuan +6 more
This paper provides the first comprehensive, end-to-end survey dedicated to the security of Retrieval-Augmented Generation (RAG) systems, systematically mapping threats, defenses, and benchmarks acros…
Yu Liu, Kun Peng, Wenxiao Zhang, Fangfang Yuan +3 more
Trans-RAG introduces a novel query-centric vector transformation technique to enable secure, efficient, and accurate cross-organizational retrieval in RAG systems without plaintext decryption.
Hanxi Li, Jianan Zhou, Jiale Lao, Yibo Wang +4 more
The paper introduces the Black-Hole Attack, a poisoning vulnerability that exploits geometric defects in high-dimensional embedding spaces to force malicious vectors into the top-k results of vector d…
Tom Sander, Hongyan Chang, Tomáš Souček, Tuan Tran +9 more
TextSeal is a novel, non-overhead, and robust watermark for LLMs that enables accurate provenance tracking and detection of unauthorized use even after model distillation.
This paper proposes a multi-layered defense strategy combining pre-output monitoring, calibrated canary detection, and cumulative information-flow tracking to prevent LLM agents from exfiltrating sens…
Zhihao Wu, Gracia Gong, Qinglin Zhu, Yudong Chen +1 more
The paper demonstrates that combining outputs from multiple large language models (LLMs) effectively cancels out statistical watermarks, revealing a fundamental vulnerability in current AI text detect…
RAGShield introduces a novel, pattern-based defense system that accurately detects subtle numerical claim manipulation in government RAG systems, overcoming the inherent blind spot of embedding-based…
SEAL-Tag is a privacy-preserving runtime environment that mitigates PII leakage in Retrieval-Augmented Generation (RAG) systems by enforcing verifiable evidence aggregation and structured auditing.
Maosen Zhang, Jianshuo Dong, Boting Lu, Wenyue Li +3 more
The paper introduces LeakDojo, a framework that systematically evaluates RAG leakage risks, finding that stronger LLM instruction-following and query generation are major independent contributors to d…
Tsun On Kwok, Xi Yang, Ki Sen Hung, Chang Liu +1 more
SentinelRAG introduces a novel watermarking framework that embeds style-consistent, fictitious knowledge entries into RAG databases, allowing for reliable detection of unauthorized redistribution whil…
Yan Liang, Ziyuan Yang, Mengyu Sun, Joey Tianyi Zhou +1 more
The paper proposes SubPopMark, a novel subpopulation-driven framework that injects harmless, verifiable markers into distilled datasets to prevent copyright infringement and data leakage.
The paper introduces a novel framework using steganographic canary files to detect and block unauthorized processing of sensitive documents by LLMs, even when the data passes through traditional secur…
The paper introduces Compositional Semantic Fingerprinting (CSF), a black-box method that allows IP owners to attribute fine-tuned text-to-image models to their protected lineages using only query acc…
Yuming Xu, Mingtao Zhang, Zhuohan Ge, Haoyang Li +6 more
This paper proposes a comprehensive taxonomy (SLOT) to systematically categorize security risks, attacks, and defenses specific to Retrieval-Augmented Generation (RAG), clarifying that these risks are…
The paper proposes an unsupervised method using multiple statistical indicators to detect adversarial or compromised context documents in Retrieval Augmented Generation (RAG) systems, even without kno…
The paper introduces MosaicLeaks, a benchmark demonstrating that deep research agents querying external sources can leak private information from their local documents, and proposes PA-DR to mitigate…
This paper re-evaluates prompt-injection attacks in realistic RAG settings, finding that most prior attack methods fail to reach the generator, and that current attacks are easily detectable.
Haobo Zhang, Xutao Mao, Guangyuan Dong, Ziwei Li +4 more
MemMark introduces a state-evolution attribution watermark that embeds owner-controlled signals into latent memory-write decisions, enabling robust provenance tracking for agent memory even when all t…
Yuqing Nie, Chong Wang, Guosheng Xu, Guoai Xu +3 more
MATRIX is a novel, robust code watermarking framework that encodes watermarks using constrained parity-check matrix equations, achieving high detection accuracy and improved robustness for code proven…
Xinlei Guan, David Arosemena, Tejaswi Dhandu, Kuan Huang +6 more
The paper proposes an end-to-end forensic pipeline using steganographic attribution and multimodal harm detection to reliably trace and attribute harmful misuse of AI-generated imagery on social platf…