~ similar to 2603.30017v1· 19 results
The paper argues that watermarking must be viewed as a monitoring primitive, introducing an observer-based threat model that shows even zero-bit watermarking can enable entity-level attribution throug…
The paper proposes a simple, generic attack strategy—re-watermarking—that reliably suppresses existing watermarks, demonstrating that watermarks can be used to attack other watermarks.
The paper introduces a theoretically grounded evaluation framework for watermarking generative models, proposing a novel method (SSB) that allows for systematic design across all security-robustness-f…
Kieu Dang, Phung Lai, NhatHai Phan, Yelong Shen +1 more
The paper proposes SAFESEAL, a novel key-conditioned watermarking framework that embeds robust, provider-specific watermarks into LLM outputs with minimal semantic distortion, effectively protecting i…
Gaussian Shannon proposes a novel watermarking framework that treats diffusion generation as a noisy communication channel, enabling both robust tracing and exact bit-level recovery of embedded waterm…
Zhihao Wu, Gracia Gong, Qinglin Zhu, Yudong Chen +1 more
The paper demonstrates that combining outputs from multiple large language models (LLMs) effectively cancels out statistical watermarks, revealing a fundamental vulnerability in current AI text detect…
Shuhao Zhang, Yuli Chen, Jiale Han, Bo Cheng +1 more
The paper proposes Adaptive Stealing (AS), a novel and more robust watermark stealing algorithm that dynamically selects optimal attack perspectives to significantly increase the efficiency of comprom…
PASA introduces a robust, semantic-level watermarking technique that embeds and detects watermarks in the latent embedding space, successfully resisting semantic-invariant attacks like paraphrasing.
TimeMark proposes a trustworthy time watermarking framework that uses cryptographic techniques and error-correcting codes to achieve 100% accurate recovery of the generation time from AIGC, resisting…
The paper proposes a novel global sketch-based watermarking technique for diffusion language models that controls the entire sequence's statistics, offering an order-agnostic and context-independent a…
The paper demonstrates that current AI watermark removal techniques fail to achieve true forensic stealth, as the removal process often leaves behind detectable signals that distinguish the output fro…
Zikang Ding, Junhao Li, Suling Wu, Junchi Yao +2 more
The paper proposes Functional Subspace Watermarking (FSW), a robust method that embeds ownership signals into a stable, low-dimensional functional subspace of LLMs, significantly improving detection a…
This paper develops provably undetectable and robust watermarking schemes for LLM outputs even when the per-token entropy is only constant, removing previous dependencies on high entropy rates or larg…
The paper proposes a novel binomial multibit LLM watermarking scheme that encodes every bit of a payload at every token position, achieving superior message accuracy and robustness compared to existin…
Andreas Müller, Denis Lukovnikov, Shingo Kodama, Minh Pham +4 more
This paper analyzes existing watermarking schemes for autoregressive image generators and demonstrates that they are vulnerable to various removal and forgery attacks, suggesting they are unreliable f…
This paper compares modern and classic post-hoc watermarking methods, concluding that classic techniques offer superior security and robustness in realistic scenarios compared to modern neural network…
Hanbo Huang, Xuan Gong, Yiran Zhang, Hao Zheng +1 more
The paper introduces RLSpoofer, a lightweight, black-box reinforcement learning attack that demonstrates the fragile resilience of current LLM watermarking schemes by achieving a high spoofing success…
Leyi Qi, Yiming Li, Siyuan Liang, Zhengzhong Tu +1 more
The paper proposes Cert-LAS, a novel certified method for verifying model ownership in text-to-image diffusion models, which is robust against malicious signal removal attacks.
Xiaokun Luan, Yihao Zhang, Pengcheng Su, Feiran Lei +1 more
VOW introduces a novel, privacy-preserving, and cryptographically verifiable protocol for detecting watermarks in LLM-generated text, overcoming the limitations of centralized and non-verifiable exist…