Papers similar to 2605.07481v1

~ similar to 2605.07481v1· 20 results

cs.CRcs.AIRecentMay 9, 2026

PASA: A Principled Embedding-Space Watermarking Approach for LLM-Generated Text under Semantic-Invariant Attacks

PASA introduces a robust, semantic-level watermarking technique that embeds and detects watermarks in the latent embedding space, successfully resisting semantic-invariant attacks like paraphrasing.

View →

cs.CRRecentApr 13, 2026

RLSpoofer: A Lightweight Evaluator for LLM Watermark Spoofing Resilience

Hanbo Huang, Xuan Gong, Yiran Zhang, Hao Zheng +1 more

The paper introduces RLSpoofer, a lightweight, black-box reinforcement learning attack that demonstrates the fragile resilience of current LLM watermarking schemes by achieving a high spoofing success…

View →

cs.CLcs.AIcs.CRRecentApr 6, 2026

XMark: Reliable Multi-Bit Watermarking for LLM-Generated Texts

Jiahao Xu, Rui Hu, Olivera Kotevska, Zikai Zhang

XMark introduces a novel multi-bit watermarking technique that reliably embeds binary messages into LLM-generated text while maintaining high text quality and robust performance even with limited toke…

View →

cs.CRcs.CLRecentMay 22, 2026

Robust LLM Watermarking with Minimal Semantic Distortion for IP Protection

Kieu Dang, Phung Lai, NhatHai Phan, Yelong Shen +1 more

The paper proposes SAFESEAL, a novel key-conditioned watermarking framework that embeds robust, provider-specific watermarks into LLM outputs with minimal semantic distortion, effectively protecting i…

View →

cs.CLRecentMay 28, 2026

Linear Ensembles Wash Away Watermarks: On the Fragility of Distributional Perturbations in LLMs

Zhihao Wu, Gracia Gong, Qinglin Zhu, Yudong Chen +1 more

The paper demonstrates that combining outputs from multiple large language models (LLMs) effectively cancels out statistical watermarks, revealing a fundamental vulnerability in current AI text detect…

View →

cs.CLcs.AIRecentMay 30, 2026

Linguistics-Aware Non-Distortionary LLM Watermarking

Shinwoo Park, Hyejin Park, Hyeseon An, Yo-Sub Han

The paper introduces LUNA, a linguistically adaptive watermarking technique that achieves high detection accuracy across diverse languages while maintaining minimal text distortion, outperforming exis…

View →

cs.CRcs.AIRecentMar 19, 2026

Functional Subspace Watermarking for Large Language Models

Zikang Ding, Junhao Li, Suling Wu, Junchi Yao +2 more

The paper proposes Functional Subspace Watermarking (FSW), a robust method that embeds ownership signals into a stable, low-dimensional functional subspace of LLMs, significantly improving detection a…

View →

cs.CRRecentApr 13, 2026

Can we Watermark Low-Entropy LLM Outputs?

Noam Mazor, Andrew Morgan, Rafael Pass

This paper develops provably undetectable and robust watermarking schemes for LLM outputs even when the per-token entropy is only constant, removing previous dependencies on high entropy rates or larg…

View →

cs.CRcs.CLcs.LGRecentMay 12, 2026

TextSeal: A Localized LLM Watermark for Provenance & Distillation Protection

Tom Sander, Hongyan Chang, Tomáš Souček, Tuan Tran +9 more

TextSeal is a novel, non-overhead, and robust watermark for LLMs that enables accurate provenance tracking and detection of unauthorized use even after model distillation.

View →

cs.CRcs.CLcs.LGRecentMay 28, 2026

Implicit Identity Technologies for LLMs: Fingerprinting and Watermarking across Datasets, Models, and Generated Content

Bing Liu, Shunping Wang, Yufan Zhu, Xinyi Yu +4 more

This paper introduces 'implicit identity' as a unifying framework to survey and categorize LLM fingerprinting and watermarking techniques for verifying ownership and provenance across datasets, models…

View →

cs.CRcs.AIRecentApr 13, 2026

Beyond A Fixed Seal: Adaptive Stealing Watermark in Large Language Models

Shuhao Zhang, Yuli Chen, Jiale Han, Bo Cheng +1 more

The paper proposes Adaptive Stealing (AS), a novel and more robust watermark stealing algorithm that dynamically selects optimal attack perspectives to significantly increase the efficiency of comprom…

View →

cs.CRcs.AIcs.CLRecentMay 28, 2026

AliMark: Enhancing Robustness of Sentence-Level Watermarking Against Text Paraphrasing

Yuexin Li, Wenjie Qu, Linyu Wu, Yulin Chen +4 more

AliMark proposes a novel framework that enhances the robustness of sentence-level watermarking by reformulating the problem as a bit sequence encoding and alignment task, significantly improving resil…

View →

cs.CRcs.AIcs.CLRecentMay 28, 2026

AliMark: Enhancing Robustness of Sentence-Level Watermarking Against Text Paraphrasing

Yuexin Li, Wenjie Qu, Linyu Wu, Yulin Chen +4 more

AliMark proposes a novel watermarking framework that treats sentence-level watermarking as a bit sequence alignment problem, significantly enhancing robustness against structural text perturbations li…

View →

cs.CRcs.AIRecentMar 30, 2026

Adversarial Attacks on Multimodal Large Language Models: A Comprehensive Survey

Bhavuk Jain, Sercan Ö. Arık, Hardeo K. Thakur

This survey provides a comprehensive taxonomy and vulnerability-centric analysis of adversarial attacks targeting Multimodal Large Language Models (MLLMs), offering an explanatory framework for enhanc…

View →

cs.CRcs.CLRecentMay 1, 2026

Block-wise Codeword Embedding for Reliable Multi-bit Text Watermarking

Joeun Kim, HoEun Kim, Dongsup Jin, Young-Sik Kim

The paper introduces BREW, a novel framework that significantly improves the reliability of multi-bit text watermarking for LLMs by replacing flawed decoding-centric methods with a designated two-stag…

View →

cs.CRRecentMay 4, 2026

VertMark: A Unified Training-Free Robust Watermarking Framework for Vertical Domain Pre-trained Language Models

Cong Kong, Xin Cheng, Zhaoxia Yin, Shuai Li +2 more

VertMark introduces a novel, unified, and training-free framework to embed robust watermarks into vertical domain pre-trained language models (VPLMs) for copyright protection across multiple specializ…

View →

cs.CRcs.AIRecentApr 22, 2026

Text Steganography with Dynamic Codebook and Multimodal Large Language Model

Jianxin Gao, Ruohan Lei, Wanli Peng

The paper proposes a secure and practical black-box text steganography method that uses a dynamic codebook and a multimodal LLM to embed secret messages into captions, outperforming existing technique…

View →

cs.CRRecentApr 30, 2026

VOW: Verifiable and Oblivious Watermark Detection for Large Language Models

Xiaokun Luan, Yihao Zhang, Pengcheng Su, Feiran Lei +1 more

VOW introduces a novel, privacy-preserving, and cryptographically verifiable protocol for detecting watermarks in LLM-generated text, overcoming the limitations of centralized and non-verifiable exist…

View →

cs.CLcs.AIcs.CRRecentMay 5, 2026

SWAN: Semantic Watermarking with Abstract Meaning Representation

Ziping Ye, Gourab Dey, Christos Christodoulopoulos, Charith Peris +6 more

SWAN introduces a novel, training-free framework that embeds watermarks directly into the semantic structure of a sentence using Abstract Meaning Representation (AMR), achieving superior robustness ag…

View →

cs.CRcs.AIcs.CYRecentMar 24, 2026

Robust Safety Monitoring of Language Models via Activation Watermarking

Toluwani Aremu, Daniil Ognev, Samuele Poppi, Nils Lukas

This paper addresses the vulnerability of existing LLM safety monitors to adaptive attackers and proposes activation watermarking, a technique that significantly improves detection robustness against…

View →