Diffusion Models

Generative models, DDPM, score-based models, and image synthesis

20 papers indexed

cs.CV · Empirical · Recent · Jun 30, 2026

GEAR: Guided End-to-End AutoRegression for Image Synthesis

Bin Lin, Zheyuan Liu, Chenguo Lin, Sixiang Chen +7 more

This paper introduces GEAR, a method for training a vector-quantized tokenizer and an autoregressive generator jointly and end-to-end, resolving the issue of non-differentiable VQ indices.

cs.CV · cs.AI · cs.LG · Recent · May 30, 2026

Improving Visual Representation Alignment Generation with GRPO

Shentong Mo, Sukmin Yun

The paper proposes VRPO, a reinforcement learning-based optimization strategy that replaces static alignment losses in diffusion models, significantly improving both convergence and image fidelity.

cs.CV · cs.AI · cs.LG · Recent · May 28, 2026

Controllable Lung Nodule Synthesis via Histogram-Regularized Latent Diffusion Models

Arunkumar Kannan, Yanbo Zhang, Han Liu, Michael Baumgartner +4 more

The paper introduces a histogram-regularized latent diffusion model to synthesize highly realistic and subtype-specific pulmonary nodules in 3D CT volumes, addressing the limitations of existing metho…

cs.CV · cs.AI · Recent · Jun 1, 2026

Order within Chaos: Capturing Intrinsic Energy Anomalies for AI-Manipulated Image Forgery Localization

Yiming Wang, Baiqi Wu, Qingming Li, Jiahao Chen +2 more

The paper proposes FLAME, a novel framework that detects AI-generated image forgeries by identifying intrinsic energy anomalies caused by the diffusion process, achieving state-of-the-art localization…

cs.CV · Recent · Jun 1, 2026

Improving Combined Detection and Classification of TEM Defects via Mask-Conditioned Latent Diffusion Augmentation

Ni Li, Nuohao Liu, Ryan Jacobs, Ajay Annamareddy +4 more

The paper proposes using a mask-conditioned latent diffusion model to generate synthetic, labeled TEM images for data augmentation, achieving small but measurable performance improvements in defect de…

cs.CV · cs.AI · cs.CR · Recent · Mar 25, 2026

When Understanding Becomes a Risk: Authenticity and Safety Risks in the Emerging Image Generation Paradigm

Ye Leng, Junjie Chu, Mingjie Li, Chenhao Lin +4 more

The paper argues that although multimodal large language models (MLLMs) offer superior semantic understanding for image generation, this enhanced capability significantly increases safety risks, partic…

cs.LG · cs.AI · Recent · May 29, 2026

Beyond Augmentation: Score-Guided Pathological Prior for EEG-based Depression Detection

Xiaojing Chen, Jingqi Cheng, Xu Zhao, Wan Jiang +1 more

The paper introduces Score-Guided Classification (SGC), a novel framework that uses an unsupervised anomaly score as a 'Pathological Prior' to guide EEG-based depression detection, overcoming the limi…

cs.CR · Recent · Apr 21, 2026

Dual-Guard: Dual-Channel Latent Watermarking for Provenance and Tamper Localization in Diffusion Images

JinFeng Xie, Chengfu Ou, Peipeng Yu, Xiaoyu Zhou +4 more

Dual-Guard introduces a dual-channel latent watermarking framework that simultaneously embeds global provenance and localized content anchors into diffusion images, achieving robust detection against…

cs.CV · cs.CR · Recent · Mar 31, 2026

SHIFT: Stochastic Hidden-Trajectory Deflection for Removing Diffusion-based Watermark

Rui Bao, Zheng Gao, Xiaoyu Li, Xiaoyan Feng +2 more

The paper introduces SHIFT, a training-free attack that exploits the vulnerability of diffusion-based watermarking by stochastically deflecting the generative trajectory, achieving high removal rates…

cs.CV · Empirical · Recent · Jul 7, 2026

From RGB Generation to Dense Field Readout: Pixel-Space Dense Prediction with Text-to-Image Models

Zanyi Wang, Xin Lin, Haodong Li, Dengyang Jiang +2 more

This paper proposes ReChannel, a method for dense prediction using a pretrained DiT model, which keeps the encoder but removes the decoder and adapts it with task LoRA. ReChannel maps each token to it…

cs.CV · Empirical · Recent · Jul 24, 2026

Twins: Learn to Predict Unified Representations with Focal Loss

Kaixiong Gong, Xin Cai, Bin Lin, Hao Wang +8 more

This paper proposes Twins, a unified continuous token space for multimodal models using ViT and VAE features, and addresses optimization imbalance with a focal regression objective.

stat.ML · cs.LG · Theoretical · Recent · Jul 18, 2026

Semi-Supervised Conditional Diffusion via Label Augmentation

Jin Su, Yuan Gao, Yong Zhou, Jian Huang

The paper introduces Label-Augmented Conditional Diffusion (LACD), a method for learning complex conditional distributions using unlabeled data, and provides theoretical guarantees for its effectivene…

cs.LG · cs.CV · Empirical · Recent · Jul 20, 2026

Three-Body Scattering for Generative Modeling

Peng Sun, Zhenglin Cheng, Deyuan Liu, Jun Xie +2 more

This paper introduces a new approach for high-dimensional one-step generation using a Three-Body Scattering Model (TBSM) with a proper distributional energy.

cs.LG · cs.CR · Recent · May 5, 2026

Graph Reconstruction from Differentially Private GNN Explanations

Rishi Raj Sahoo, Jyotirmaya Shivottam, Subhankar Mishra

This paper introduces an attack, PRIVX, demonstrating that even differentially private (DP) Graph Neural Network (GNN) explanations leak enough structural information to allow an adversary to accurate…

cs.AI · cs.LG · Recent · May 29, 2026

From Noise to Control: Parameterized Diffusion Policies

Renhao Zhang, Haotian Fu, Mingxi Jia, George Konidaris +2 more

The Parameterized Diffusion Policy (PDP) framework transforms diffusion models from general stochastic generators into precise, steerable tools for learning and adapting complex robotic behaviors by e…

cs.CV · cs.AI · Recent · Jun 1, 2026

Fast and Lightweight Novel View Synthesis with Differentiable Multiplane Image

Kaidi Zhang, Guanxu Zhu

The paper proposes a fast and lightweight novel view synthesis method using a differentiable Multiplane Image (MPI) representation, achieving significant speed and size improvements over state-of-the-…

cs.AI · cs.CR · Recent · May 11, 2026

diffGHOST: Diffusion based Generative Hedged Oblivious Synthetic Trajectories

Florent Guépin, Cheick Tidiani Cisse, Denis Renaud, François Bidet +1 more

The paper introduces diffGHOST, a conditional diffusion model that generates synthetic, privacy-preserving mobility trajectories by explicitly mitigating sample memorization in the latent space.

cs.CV · cs.AI · Recent · May 29, 2026

DiffCrossGait: Trajectory-Level Alignment for 2D-3D Cross-Modal Gait Recognition via Latent Diffusion

Zhiyang Lu, Ming Cheng

DiffCrossGait proposes a novel trajectory-level alignment method using latent diffusion to overcome domain discrepancies in 2D-3D gait recognition, achieving state-of-the-art performance.

eess.AS · cs.AI · cs.LG · Empirical · Recent · Jun 18, 2026

Repurposing a Speech Classifier for Guided Diffusion-Based Speech Generation

Rostislav Makarov, Timo Gerkmann

The paper repurposes a pre-trained speech classifier as the backbone for diffusion generation, reducing the need for two separately trained models.

cs.AI · Recent · Jun 1, 2026

Bridging the Sim-to-Real Gap in Semiconductor Visual Program Synthesis via Input Binarization

Yusuke Ohtsubo, Kota Dohi, Koichiro Yawata, Koki Takeshita +1 more

The paper proposes a visual program synthesis framework using a VLM to generate accurate training data for semiconductor inspection, mitigating the sim-to-real gap by applying input binarization to st…
