Papers similar to 2606.04486v1

~ similar to 2606.04486v1· 19 results

cs.CRcs.AIRecentMar 19, 2026

Functional Subspace Watermarking for Large Language Models

Zikang Ding, Junhao Li, Suling Wu, Junchi Yao +2 more

The paper proposes Functional Subspace Watermarking (FSW), a robust method that embeds ownership signals into a stable, low-dimensional functional subspace of LLMs, significantly improving detection a…

View →

cs.CRcs.AIRecentMay 9, 2026

PASA: A Principled Embedding-Space Watermarking Approach for LLM-Generated Text under Semantic-Invariant Attacks

Zhenxin Ai, Haiyun He

PASA introduces a robust, semantic-level watermarking technique that embeds and detects watermarks in the latent embedding space, successfully resisting semantic-invariant attacks like paraphrasing.

View →

cs.CVcs.AIcs.CRRecentApr 13, 2026

On the Robustness of Watermarking for Autoregressive Image Generation

Andreas Müller, Denis Lukovnikov, Shingo Kodama, Minh Pham +4 more

This paper analyzes existing watermarking schemes for autoregressive image generators and demonstrates that they are vulnerable to various removal and forgery attacks, suggesting they are unreliable f…

View →

cs.CLcs.AIcs.CRRecentApr 6, 2026

XMark: Reliable Multi-Bit Watermarking for LLM-Generated Texts

Jiahao Xu, Rui Hu, Olivera Kotevska, Zikai Zhang

XMark introduces a novel multi-bit watermarking technique that reliably embeds binary messages into LLM-generated text while maintaining high text quality and robust performance even with limited toke…

View →

cs.CLRecentMay 28, 2026

Linear Ensembles Wash Away Watermarks: On the Fragility of Distributional Perturbations in LLMs

Zhihao Wu, Gracia Gong, Qinglin Zhu, Yudong Chen +1 more

The paper demonstrates that combining outputs from multiple large language models (LLMs) effectively cancels out statistical watermarks, revealing a fundamental vulnerability in current AI text detect…

View →

cs.CLcs.AIRecentMay 30, 2026

Linguistics-Aware Non-Distortionary LLM Watermarking

Shinwoo Park, Hyejin Park, Hyeseon An, Yo-Sub Han

The paper introduces LUNA, a linguistically adaptive watermarking technique that achieves high detection accuracy across diverse languages while maintaining minimal text distortion, outperforming exis…

View →

cs.CRcs.AIcs.CLRecentApr 24, 2026

SSG: Logit-Balanced Vocabulary Partitioning for LLM Watermarking

Chenxi Gu, Xiaoning Du, John Grundy

The paper proposes SSG, a novel logit-balanced vocabulary partitioning method, to enhance the watermark strength and detectability of LLM-generated content, especially in low-entropy domains like code…

View →

cs.CLcs.AIcs.CRRecentMay 22, 2026

Extracting Training Data from Diffusion Language Models via Infilling

Yihan Wang, N. Asokan

The paper introduces 'infilling extraction' to accurately model training data memorization in Diffusion Language Models (DLMs), finding that bidirectional masking significantly increases the extractab…

View →

cs.CRRecentMay 4, 2026

VertMark: A Unified Training-Free Robust Watermarking Framework for Vertical Domain Pre-trained Language Models

Cong Kong, Xin Cheng, Zhaoxia Yin, Shuai Li +2 more

VertMark introduces a novel, unified, and training-free framework to embed robust watermarks into vertical domain pre-trained language models (VPLMs) for copyright protection across multiple specializ…

View →

cs.CRcs.CVcs.GRRecentMay 28, 2026

Cert-LAS: Toward Certified Model Ownership Verification for Text-to-Image Diffusion Models via Layer-Adaptive Smoothing

Leyi Qi, Yiming Li, Siyuan Liang, Zhengzhong Tu +1 more

The paper proposes Cert-LAS, a novel certified method for verifying model ownership in text-to-image diffusion models, which is robust against malicious signal removal attacks.

View →

cs.CRRecentMay 28, 2026

LoRA-Key: User-Centric LoRA Watermarking for Text-to-Image Diffusion Models

Yaopeng Wang, Qingliang Wang, Zhibo Wang, Huiyu Xu +4 more

LoRA-Key introduces a user-centric watermarking framework that attaches a recoverable ownership key to LoRA modules via a standalone Watermark LoRA, providing lightweight, plug-and-play copyright prot…

View →

cs.CRcs.AIRecentApr 13, 2026

Beyond A Fixed Seal: Adaptive Stealing Watermark in Large Language Models

Shuhao Zhang, Yuli Chen, Jiale Han, Bo Cheng +1 more

The paper proposes Adaptive Stealing (AS), a novel and more robust watermark stealing algorithm that dynamically selects optimal attack perspectives to significantly increase the efficiency of comprom…

View →

cs.CRcs.AIRecentMay 8, 2026

Vaporizer: Breaking Watermarking Schemes for Large Language Model Outputs

Jonathan Hong Jin Ng, Anh Tu Ngo, Anupam Chattopadhyay

The paper analyzes the robustness of current LLM watermarking schemes against various text modifications, concluding that watermarks can be removed with reasonable effort.

View →

cs.CRcs.CLcs.LGRecentMay 12, 2026

TextSeal: A Localized LLM Watermark for Provenance & Distillation Protection

Tom Sander, Hongyan Chang, Tomáš Souček, Tuan Tran +9 more

TextSeal is a novel, non-overhead, and robust watermark for LLMs that enables accurate provenance tracking and detection of unauthorized use even after model distillation.

View →

cs.CVcs.CRRecentMar 27, 2026

Gaussian Shannon: High-Precision Diffusion Model Watermarking Based on Communication

Yi Zhang, Hongbo Huang, Liang-Jie Zhang

Gaussian Shannon proposes a novel watermarking framework that treats diffusion generation as a noisy communication channel, enabling both robust tracing and exact bit-level recovery of embedded waterm…

View →

cs.CRRecentApr 13, 2026

Can we Watermark Low-Entropy LLM Outputs?

Noam Mazor, Andrew Morgan, Rafael Pass

This paper develops provably undetectable and robust watermarking schemes for LLM outputs even when the per-token entropy is only constant, removing previous dependencies on high entropy rates or larg…

View →

cs.CRRecentMar 18, 2026

Proof-of-Authorship for Diffusion-based AI Generated Content

De Zhang Lee, Han Fang, Ee-Chien Chang

The paper proposes a novel proof-of-authorship framework for AI-generated content by cryptographically binding the random seed used in latent diffusion model generation to the author's identity, offer…

View →

cs.LGcs.CRRecentMay 19, 2026

Backdooring Masked Diffusion Language Models

Daniel Yiming Cao, Chengzhong Wang, Sheng-Yen Chou, Chengyu Huang +2 more

The paper introduces SHADOWMASK, the first systematic backdoor attack targeting Masked Diffusion Language Models (MDLMs), demonstrating near-100% attack success while preserving clean model utility.

View →

cs.CRcs.AIRecentMay 27, 2026

Blind PRNG Hijacking: An Undetectable Integrity-Preserving Attack Against LLM Watermarking

Ziyang You, Huilong He, Xiaoke Yang, Xuxing Lu

The paper introduces SeedHijack, a novel, undetectable supply-chain attack that biases LLM watermarking signals by hijacking the underlying Pseudo-Random Number Generator (PRNG) without altering the g…

View →