Papers similar to 2604.01635v1

20 results

cs.CR, cs.AI · Recent · May 2, 2026

VisInject: Disruption != Injection -- A Dual-Dimension Evaluation of Universal Adversarial Attacks on Vision-Language Models

The paper introduces a dual-dimension evaluation for universal adversarial attacks on Vision-Language Models (VLMs), demonstrating that high reported attack success rates significantly overestimate th…

cs.CR · Recent · May 11, 2026

Generate "Normal", Edit Poisoned: Branding Injection via Hint Embedding in Image Editing

Desen Sun, Jason Hon, Howe Wang, Saarth Rajan +2 more

This paper investigates a novel security vulnerability where imperceptible branding hints can be injected into images and subsequently re-rendered onto new objects by generative AI models, proposing b…

cs.CR · Recent · Jun 1, 2026

On Improving Robustness of Deepfake Image Detectors

Abu Taib Mohammed Shahjahan, Mohammad Mannan, Abdessamad Ben Hamza, Amr Youssef

The paper proposes a unified, architecture-agnostic framework that significantly improves the robustness of deepfake image detectors against adversarial attacks by focusing on higher-order frequency s…

cs.CV, cs.AI · Recent · Jun 1, 2026

Order within Chaos: Capturing Intrinsic Energy Anomalies for AI-Manipulated Image Forgery Localization

Yiming Wang, Baiqi Wu, Qingming Li, Jiahao Chen +2 more

The paper proposes FLAME, a novel framework that detects AI-generated image forgeries by identifying intrinsic energy anomalies caused by the diffusion process, achieving state-of-the-art localization…

cs.CR, cs.LG · Recent · May 13, 2026

DiffusionHijack: Supply-Chain PRNG Backdoor Attack on Diffusion Models and Quantum Random Number Defense

Ziyang You, Liling Zheng, Xiaoke Yang, Xuxing Lu

The paper introduces DiffusionHijack, a supply-chain backdoor attack that compromises the PRNG used by diffusion models to deterministically control generated images, which is successfully mitigated b…

cs.CR · Recent · Apr 12, 2026

SEED: A Large-Scale Benchmark for Provenance Tracing in Sequential Deepfake Facial Edits

Mengieong Hoi, Zhedong Zheng, Ping Liu, Wei Liu

The paper introduces SEED, a large-scale benchmark dataset for tracing sequential deepfake facial edits, and proposes FAITH, a frequency-aware Transformer model that effectively detects and orders the…

cs.CR · Recent · Apr 21, 2026

Dual-Guard: Dual-Channel Latent Watermarking for Provenance and Tamper Localization in Diffusion Images

JinFeng Xie, Chengfu Ou, Peipeng Yu, Xiaoyu Zhou +4 more

Dual-Guard introduces a dual-channel latent watermarking framework that simultaneously embeds global provenance and localized content anchors into diffusion images, achieving robust detection against…

cs.CV, cs.AI · Recent · Jun 1, 2026

Suppressing Forgery-Specific Shortcuts for Generalizable Deepfake Detection

Yihui Wang, Yonghui Yang, Jilong Liu, Fengbin Zhu +2 more

The paper proposes the Shortcut Subspace Suppression (S^3) framework to improve deepfake detection generalization by explicitly identifying and suppressing method-specific shortcuts in learned feature…

cs.CR, cs.AI · Recent · Apr 30, 2026

Latent Adversarial Detection: Adaptive Probing of LLM Activations for Multi-Turn Attack Detection

Prashant Kulkarni

The paper introduces 'adversarial restlessness,' an activation-level signature in LLM residual streams, to detect multi-turn prompt injection attacks with high accuracy.

cs.CR · Recent · May 1, 2026

Repurposing Image Diffusion Models for Adversarial Synthetic Structured Data: A Case Study of Ground Truth Drift

Adam Arthur, Christopher Schwartz

The paper demonstrates that off-the-shelf image diffusion models, like Stable Diffusion, can be repurposed to generate synthetic structured data, posing a threat of ground truth drift in closed eviden…

cs.CR, cs.DC · Recent · May 15, 2026

PCDM: A Diffusion-Based Data Poisoning Attack Against Federated Learning Systems

Wei Sun, Yijun Chen, Bo Gao, Ke Xiong +3 more

The paper proposes PCDM, a diffusion-based framework that enables highly stealthy and effective data poisoning attacks against Federated Learning systems, significantly degrading global performance wh…

cs.CV, cs.CR, cs.LG · Recent · May 14, 2026

Systematic Discovery of Semantic Attacks in Online Map Construction through Conditional Diffusion

Chenyi Wang, Ruoyu Song, Raymond Muller, Jean-Philippe Monteuuis +4 more

The paper introduces MIRAGE, a framework that systematically discovers semantic attacks on online HD map construction by finding plausible environmental variations that bypass standard adversarial def…

cs.CR, cs.CV · Recent · May 15, 2026

A Cross-Modal Prompt Injection Attack against Large Vision-Language Models with Image-Only Perturbation

Hao Yang, Zhuo Ma, Yang Liu, Yilong Yang +2 more

The paper introduces CrossMPI, a novel cross-modal prompt injection attack that uses image-only perturbations to steer the interpretation of both textual and visual inputs in Large Vision-Language Mod…

cs.CR, cs.CV · Recent · Apr 14, 2026

Scaling Exposes the Trigger: Input-Level Backdoor Detection in Text-to-Image Diffusion Models via Cross-Attention Scaling

Zida Li, Jun Li, Yuzhe Sha, Ziqiang Li +2 more

The paper introduces SET, a robust input-level backdoor detection framework that detects hidden malicious triggers in text-to-image diffusion models by analyzing systematic differences in how benign a…

cs.CV, cs.AI, cs.CR · Recent · Mar 30, 2026

TGIF2: Extended Text-Guided Inpainting Forgery Dataset & Benchmark

Hannes Mareen, Dimitrios Karageorgiou, Paschalis Giakoumoglou, Peter Lambert +2 more

The paper introduces TGIF2, an extended dataset and benchmark that evaluates the forensic robustness of image forgery detection methods against modern, advanced text-guided inpainting techniques.

cs.CR, cs.AI, cs.LG · Recent · May 15, 2026

GenAI-FDIA: Physics-Informed Generative Models for False Data Injection Attacks

Mohammad A. Razzaque, Muta Tah Hira

The paper introduces GenAI-FDIA, a comprehensive framework that benchmarks various physics-informed generative models to synthesize high-fidelity False Data Injection Attacks (FDIA) for power systems,…

cs.CR, cs.LG · Recent · May 19, 2026

Awakening the Hydra: Stabilizing Multi-Concept Backdoor Injection in Text-to-Image Diffusion Models

Kai Wang, Jiale Zhang, Chengcheng Zhu, Chuang Ma +1 more

The paper proposes Hydra, a framework to stabilize and control the injection of multiple, conflicting backdoor triggers into text-to-image diffusion models, ensuring high attack reliability while main…

cs.CR · Recent · May 18, 2026

On the Geometric Limits of Transformer Defenses against Obfuscation Attacks: Latent Embedding Collapse & Performance Robustness Gap

Becky Mashaido, Tapadhir Das

The paper demonstrates that high detection performance against obfuscated prompts does not guarantee representational robustness, identifying a phenomenon called latent embedding collapse.

cs.CV, cs.AI, cs.CR · Recent · Apr 10, 2026

Leave My Images Alone: Preventing Multi-Modal Large Language Models from Analyzing Images via Visual Prompt Injection

Zedian Shao, Hongbin Liu, Yuepeng Hu, Neil Zhenqiang Gong

The paper introduces ImageProtector, a user-side method that embeds an imperceptible perturbation into images to prevent Multi-modal Large Language Models (MLLMs) from analyzing and extracting sensiti…

cs.CV, cs.CR · Recent · Mar 31, 2026

SHIFT: Stochastic Hidden-Trajectory Deflection for Removing Diffusion-based Watermark

Rui Bao, Zheng Gao, Xiaoyu Li, Xiaoyan Feng +2 more

The paper introduces SHIFT, a training-free attack that exploits the vulnerability of diffusion-based watermarking by stochastically deflecting the generative trajectory, achieving high removal rates…
