~ similar to 2605.12942v2· 19 results
Haobo Zhang, Xutao Mao, Guangyuan Dong, Ziwei Li +4 more
MemMark introduces a state-evolution attribution watermark that embeds owner-controlled signals into latent memory-write decisions, enabling robust provenance tracking for agent memory even when all t…
The paper argues that watermarking must be viewed as a monitoring primitive, introducing an observer-based threat model that shows even zero-bit watermarking can enable entity-level attribution throug…
The paper introduces WaveGuard, a frequency-aware, single-pass defense framework that safeguards text-to-image models by injecting structured, imperceptible perturbations into generated images, thereb…
Leyi Qi, Yiming Li, Siyuan Liang, Zhengzhong Tu +1 more
The paper proposes Cert-LAS, a novel certified method for verifying model ownership in text-to-image diffusion models, which is robust against malicious signal removal attacks.
Zhiyang Dai, Yansong Gao, Boyu Kuang, Haodong Li +4 more
This paper repurposes the statistical signals from data-poisoning backdoor attacks on contrastive learning (CL) models to create a multi-level, effective watermarking scheme for dataset intellectual p…
Guang Yang, Amir Ghasemian, Fengchen Liu, Zhong Wang +2 more
The paper proposes interaction-layer antidistillation watermarks by embedding behavioral markers into the system prompt, which successfully track knowledge distillation even when paraphrasing attacker…
Zibo Diao, Jingchu Gai, Xinyue Ai, Zhang Zhang +2 more
The paper introduces Lossless Anti-Distillation Sampling (LADS), a novel sampling scheme that makes harvested data correlated for malicious distillers while ensuring benign users receive statistically…
The paper proposes an attestation-aware promotion gate to mitigate supply-chain risks in LLM pipelines by cryptographically verifying and enforcing claims about training and release artifacts before d…
Cong Kong, Xin Cheng, Zhaoxia Yin, Shuai Li +2 more
VertMark introduces a novel, unified, and training-free framework to embed robust watermarks into vertical domain pre-trained language models (VPLMs) for copyright protection across multiple specializ…
The paper proposes a unified closed-loop threat taxonomy to systematically analyze and defend foundation models by explicitly framing the bidirectional security interactions between data and models.
Bing Liu, Shunping Wang, Yufan Zhu, Xinyi Yu +4 more
This paper introduces 'implicit identity' as a unifying framework to survey and categorize LLM fingerprinting and watermarking techniques for verifying ownership and provenance across datasets, models…
The paper introduces Compositional Semantic Fingerprinting (CSF), a black-box method that allows IP owners to attribute fine-tuned text-to-image models to their protected lineages using only query acc…
The paper demonstrates a class of steganographic exfiltration attacks against vector databases by hiding data within embeddings, and proposes VectorPin, a cryptographic provenance protocol to detect s…
Xinlei Guan, David Arosemena, Tejaswi Dhandu, Kuan Huang +6 more
The paper proposes an end-to-end forensic pipeline using steganographic attribution and multimodal harm detection to reliably trace and attribute harmful misuse of AI-generated imagery on social platf…
The paper introduces FLIPS, an instance-level fingerprinting technique that exploits biases in generated random sequences to accurately distinguish between different configurations of the same Large L…
Hanna Foerster, Ilia Shumailov, Cheng Zhang, Yiren Zhao +2 more
This paper identifies a critical privacy vulnerability, termed Quantamination, where dynamic quantization in popular ML frameworks can leak sensitive user data across batch boundaries.
The paper introduces the concept of 'authenticity debt'—the institutional liability from deploying unverified AI content—and proposes a layered reference architecture combining cryptographic provenanc…
The paper introduces the concept of 'authenticity debt'—the institutional liability from deploying unverified AI content—and proposes a layered reference architecture combining cryptographic provenanc…
The paper proposes a novel proof-of-authorship framework for AI-generated content by cryptographically binding the random seed used in latent diffusion model generation to the author's identity, offer…