~ similar to 2604.25486v1· 19 results
XMark introduces a novel multi-bit watermarking technique that reliably embeds binary messages into LLM-generated text while maintaining high text quality and robust performance even with limited toke…
The paper proposes an efficient and provably secure linguistic steganography method using range coding that achieves high embedding capacity and speed, outperforming existing methods.
The paper proposes a provably secure steganography scheme based on list decoding that significantly increases embedding capacity for Large Language Models (LLMs) compared to existing methods.
The paper introduces LUNA, a linguistically adaptive watermarking technique that achieves high detection accuracy across diverse languages while maintaining minimal text distortion, outperforming exis…
Zhihao Wu, Gracia Gong, Qinglin Zhu, Yudong Chen +1 more
The paper demonstrates that combining outputs from multiple large language models (LLMs) effectively cancels out statistical watermarks, revealing a fundamental vulnerability in current AI text detect…
This paper develops provably undetectable and robust watermarking schemes for LLM outputs even when the per-token entropy is only constant, removing previous dependencies on high entropy rates or larg…
The paper proposes SSG, a novel logit-balanced vocabulary partitioning method, to enhance the watermark strength and detectability of LLM-generated content, especially in low-entropy domains like code…
PASA introduces a robust, semantic-level watermarking technique that embeds and detects watermarks in the latent embedding space, successfully resisting semantic-invariant attacks like paraphrasing.
The paper introduces SeedHijack, a novel, undetectable supply-chain attack that biases LLM watermarking signals by hijacking the underlying Pseudo-Random Number Generator (PRNG) without altering the g…
The paper introduces SeedHijack, a novel, undetectable supply-chain attack that biases LLM watermarking signals by hijacking the underlying PRNG, thereby amplifying the watermark without altering the…
The paper introduces BREW, a novel framework that significantly improves the reliability of multi-bit text watermarking for LLMs by replacing flawed decoding-centric methods with a designated two-stag…
Yuqing Nie, Chong Wang, Guosheng Xu, Guoai Xu +3 more
MATRIX is a novel, robust code watermarking framework that encodes watermarks using constrained parity-check matrix equations, achieving high detection accuracy and improved robustness for code proven…
Shuhao Zhang, Yuli Chen, Jiale Han, Bo Cheng +1 more
The paper proposes Adaptive Stealing (AS), a novel and more robust watermark stealing algorithm that dynamically selects optimal attack perspectives to significantly increase the efficiency of comprom…
Meifang Chen, Zhe Yang, Huang Nianchen, Yizhan Huang +3 more
This paper investigates how Byte-Pair Encoding (BPE) tokenization causes Code LLMs to disproportionately memorize certain types of secrets, a phenomenon termed 'gibberish bias'.
Tom Sander, Hongyan Chang, Tomáš Souček, Tuan Tran +9 more
TextSeal is a novel, non-overhead, and robust watermark for LLMs that enables accurate provenance tracking and detection of unauthorized use even after model distillation.
Xiaokun Luan, Yihao Zhang, Pengcheng Su, Feiran Lei +1 more
VOW introduces a novel, privacy-preserving, and cryptographically verifiable protocol for detecting watermarks in LLM-generated text, overcoming the limitations of centralized and non-verifiable exist…
The paper demonstrates a class of steganographic exfiltration attacks against vector databases by hiding data within embeddings, and proposes VectorPin, a cryptographic provenance protocol to detect s…
The paper introduces an entropy-aware masking strategy for Masked Language Modeling (MLM) that targets informative and uncertain tokens, achieving up to a 5% performance improvement on GLUE scores.
The paper analyzes the robustness of current LLM watermarking schemes against various text modifications, concluding that watermarks can be removed with reasonable effort.