~ similar to 2604.15829v1· 20 results
The paper proposes Neighbor-Aware Localized Concept Erasure (NLCE), a training-free framework that effectively removes specific concepts from text-to-image models while minimizing the unintended degra…
CoreUnlearn introduces a novel framework that disentangles and removes undesirable concepts from text-guided diffusion models by targeting specific, erasure-critical components of the concept embeddin…
Yuhao Sun, Lingyun Yu, Haoxiang Xu, Fengyuan Miao +2 more
The paper proposes Orthogonal Concept Erasure (OCE), a novel multiplicative parameter update method that achieves precise concept erasure in diffusion models while independently preserving overall gen…
The paper proposes AHV-D&S, a novel training-free inference-time safeguard that detects and suppresses risky content in Diffusion Transformers (DiTs) by quantifying token sensitivity across attention…
Aishwarya Agrawal, Roy Hirsch, Yasumasa Onoe, Sherry Ben +1 more
The paper introduces TECCI, a novel and challenging benchmark dataset of 7550 image-edit pairs, and demonstrates that current state-of-the-art text-guided image editing models struggle significantly w…
Chaoshuo Zhang, Yibo Liang, Mengke Tian, Chenhao Lin +5 more
This paper introduces TwoHamsters, a new benchmark that rigorously tests Multi-Concept Compositional Unsafety (MCCU) in text-to-image models, demonstrating that current state-of-the-art models and saf…
The paper introduces GEM, an effective concept erasure framework for Rectified Flow Transformers, by unifying trajectory-based unlearning with classic teacher-guided flow matching.
The paper introduces 'contrastive privacy,' a formal, model-agnostic, and quantitative method for evaluating the semantic success of AI-based sanitization across multiple media modalities.
The paper introduces WaveGuard, a frequency-aware, single-pass defense framework that safeguards text-to-image models by injecting structured, imperceptible perturbations into generated images, thereb…
XMark introduces a novel multi-bit watermarking technique that reliably embeds binary messages into LLM-generated text while maintaining high text quality and robust performance even with limited toke…
The paper introduces TGIF2, an extended dataset and benchmark that evaluates the forensic robustness of image forgery detection methods against modern, advanced text-guided inpainting techniques.
The paper introduces a robust, two-part framework (HyPE and HyPS) using hyperbolic geometry to efficiently detect and sanitize malicious prompts targeting Vision-Language Models (VLMs).
Kai Wang, Jiale Zhang, Chengcheng Zhu, Chuang Ma +1 more
The paper proposes Hydra, a framework to stabilize and control the injection of multiple, conflicting backdoor triggers into text-to-image diffusion models, ensuring high attack reliability while main…
Desen Sun, Jason Hon, Howe Wang, Saarth Rajan +2 more
This paper investigates a novel security vulnerability where imperceptible branding hints can be injected into images and subsequently re-rendered onto new objects by generative AI models, proposing b…
The paper introduces ImageProtector, a user-side method that embeds an imperceptible perturbation into images to prevent Multi-modal Large Language Models (MLLMs) from analyzing and extracting sensiti…
The paper proposes a disentangled representation framework to significantly improve few-shot layout-to-image generation by separating semantic identity from local visual details, thereby mitigating re…
Jinghuai Zhang, Pengyue Yu, Zhexiao Lin, Kunlin Cai +2 more
ImageAuditor introduces a novel Membership Inference Attack (MIA) specifically designed for Image-based Retrieval-Augmented Generation (IRAG) systems, achieving high accuracy by addressing cross-modal…
The paper introduces UniKE, a benchmark showing that successful knowledge edits in text-only multimodal models do not reliably transfer to image generation, revealing a significant modality gap.
Zhihao Wu, Gracia Gong, Qinglin Zhu, Yudong Chen +1 more
The paper demonstrates that combining outputs from multiple large language models (LLMs) effectively cancels out statistical watermarks, revealing a fundamental vulnerability in current AI text detect…
F. Carichon, S. Sharma, M. Girard, R. Rampa +1 more
The paper introduces IDEAFix, a systematic evaluation framework designed to analyze how structured prompting and task design influence the divergent thinking and originality of idea generation in LLMs…