~ similar to 2606.01843· 19 results
The paper proposes a unified, architecture-agnostic framework that significantly improves the robustness of deepfake image detectors against adversarial attacks by focusing on higher-order frequency s…
The paper introduces TGIF2, an extended dataset and benchmark that evaluates the forensic robustness of image forgery detection methods against modern, advanced text-guided inpainting techniques.
This paper proposes a 3D CNN detector that leverages temporal artifacts to accurately identify high-quality deepfake videos, demonstrating robust detection even after social media re-encoding.
Yifan Liao, Yule Liu, Zhen Sun, Zongmin Zhang +4 more
The paper introduces MARS, a novel meta-adversarial framework that significantly improves black-box adversarial attacks against state-of-the-art Singing Voice Deepfake Detection (SVDD) systems by esca…
The study demonstrates that robust, domain-invariant representations of synthetic deception can be rapidly entrenched in LLMs using modest fine-tuning, detectable by linear probes even in early layers…
The paper introduces a new adaptive jailbreak attack (JB-GCG) that successfully bypasses the state-of-the-art JBShield defense, and proposes a more robust defense (RTV) based on multi-layer representa…
Ke Liu, Jiwei Wei, Wenyu Zhang, Shuchang Zhou +4 more
The paper introduces a new dataset (SHDF) and a framework (T-AVFD) to robustly detect audio-visual deepfakes, specifically addressing the challenge posed by singing vocalizations.
GAFSV-Net introduces a novel 2D vision framework by encoding temporal signature data into a six-channel Gramian Angular Field image, significantly improving online signature verification accuracy over…
Zikang Ding, Junhao Li, Suling Wu, Junchi Yao +2 more
The paper proposes Functional Subspace Watermarking (FSW), a robust method that embeds ownership signals into a stable, low-dimensional functional subspace of LLMs, significantly improving detection a…
Hanbo Huang, Xuan Gong, Yiran Zhang, Hao Zheng +1 more
The paper introduces RLSpoofer, a lightweight, black-box reinforcement learning attack that demonstrates the fragile resilience of current LLM watermarking schemes by achieving a high spoofing success…
The paper demonstrates that current AI watermark removal techniques fail to achieve true forensic stealth, as the removal process often leaves behind detectable signals that distinguish the output fro…
This paper systematically diagnoses the failure modes of linear deception probes in LLMs, finding that while single-direction probes are insufficient, multi-dimensional probes can recover robust detec…
Mathias Graf, Marco Willi, Melanie Mathys, Michael Aerni +3 more
DeepSignature proposes a novel, cryptographically verifiable watermarking system that uses deep neural networks to embed digital signatures into images, enabling robust source attribution and near 100…
LiteGuard proposes an efficient task-agnostic model fingerprinting framework that achieves enhanced generalization and significantly reduces computational overhead compared to existing methods like Me…
BackFlush introduces a novel, knowledge-free framework that detects and eliminates unknown backdoor attacks in LLMs while simultaneously preserving existing watermarks, achieving high detection rates…
Yue Li, Linying Xue, Kaiqing Lin, Hanyu Quan +4 more
The paper proposes AEGIS, a novel diffusion-guided method for injecting adversarial perturbations into the latent space to create generalizable and robust defenses against advanced facial deepfake man…
The paper introduces DECKER, a domain-invariant framework that significantly improves cross-keyboard keystroke inference by normalizing device variations and leveraging linguistic context, demonstrati…
Bo Wang, Jia Ni, Mengnan Zhao, Zhan Qin +1 more
This paper systematically investigates unlearnable examples (UEs) across diverse training paradigms, finding that existing UEs fail under pretraining-finetuning (PF) settings, and proposes Shallow Sem…
The paper proposes a simple, generic attack strategy—re-watermarking—that reliably suppresses existing watermarks, demonstrating that watermarks can be used to attack other watermarks.