~ similar to 2605.28298· 20 results
The paper analyzes the robustness of current LLM watermarking schemes against various text modifications, concluding that watermarks can be removed with reasonable effort.
Zhihao Wu, Gracia Gong, Qinglin Zhu, Yudong Chen +1 more
The paper demonstrates that combining outputs from multiple large language models (LLMs) effectively cancels out statistical watermarks, revealing a fundamental vulnerability in current AI text detect…
Pengcheng Zhou, Pianran Guo, Shuhua Chen, Mengqin Zhao +2 more
The paper proposes Domain-Aware Sharpness Minimization (DASM), a novel optimizer that enhances the robustness and generalization of voice stream steganalysis models across varying data distributions.
The paper introduces LUNA, a linguistically adaptive watermarking technique that achieves high detection accuracy across diverse languages while maintaining minimal text distortion, outperforming exis…
The paper demonstrates a class of steganographic exfiltration attacks against vector databases by hiding data within embeddings, and proposes VectorPin, a cryptographic provenance protocol to detect s…
Zhen Sun, Zongmin Zhang, Leyi Sheng, Yule Liu +6 more
The paper introduces SADBench, a systematic benchmark designed to evaluate both the effectiveness of steganographic attacks injecting harmful content and the robustness of steganalysis defenses agains…
Cong Kong, Xin Cheng, Zhaoxia Yin, Shuai Li +2 more
VertMark introduces a novel, unified, and training-free framework to embed robust watermarks into vertical domain pre-trained language models (VPLMs) for copyright protection across multiple specializ…
PASA introduces a robust, semantic-level watermarking technique that embeds and detects watermarks in the latent embedding space, successfully resisting semantic-invariant attacks like paraphrasing.
This paper investigates the privacy risk of reconstructing Personally Identifiable Information (PII) from Large Language Models (LLMs) that have undergone Supervised Finetuning (SFT), proposing a nove…
The paper introduces BREW, a novel framework that significantly improves the reliability of multi-bit text watermarking for LLMs by replacing flawed decoding-centric methods with a designated two-stag…
The paper proposes SSG, a novel logit-balanced vocabulary partitioning method, to enhance the watermark strength and detectability of LLM-generated content, especially in low-entropy domains like code…
Alexander Nemecek, Osama Zafar, Yuqiao Xu, Wenbiao Li +1 more
The paper argues that current AI content watermarking benchmarks fail to test for bias across different languages, cultures, and demographics, proposing a new set of evaluation standards to ensure fai…
The paper proposes a secure and practical black-box text steganography method that uses a dynamic codebook and a multimodal LLM to embed secret messages into captions, outperforming existing technique…
Xinlei Guan, David Arosemena, Tejaswi Dhandu, Kuan Huang +6 more
The paper proposes an end-to-end forensic pipeline using steganographic attribution and multimodal harm detection to reliably trace and attribute harmful misuse of AI-generated imagery on social platf…
The paper introduces DECKER, a domain-invariant framework that significantly improves cross-keyboard keystroke inference by normalizing device variations and leveraging linguistic context, demonstrati…
The paper introduces ImageProtector, a user-side method that embeds an imperceptible perturbation into images to prevent Multi-modal Large Language Models (MLLMs) from analyzing and extracting sensiti…
Tsun On Kwok, Xi Yang, Ki Sen Hung, Chang Liu +1 more
SentinelRAG introduces a novel watermarking framework that embeds style-consistent, fictitious knowledge entries into RAG databases, allowing for reliable detection of unauthorized redistribution whil…
Shashie Dilhara Batan Arachchige, Hassan Jameel Asghar, Benjamin Zi Hao Zhao, Dinusha Vatsalan +1 more
The paper proposes a character-level differential privacy mechanism to sanitize sensitive user prompts for LLMs, achieving high privacy for PII while maintaining utility for non-sensitive context.
The paper introduces 'contrastive privacy,' a formal, model-agnostic, and quantitative method for evaluating the semantic success of AI-based sanitization across multiple media modalities.
Maofei Chen, Laifu Wang, Yue Qin, Yuan Wang +2 more
The paper demonstrates that using raw source text for fine-tuning LLMs on vulnerability detection causes high false-positive rates by memorizing surface-level syntax, a problem mitigated by using Abst…