Papers similar to 2606.02331

Similar to 2606.02331 · 19 results

cs.LG, cs.CV · Recent · Jun 1, 2026

Measurement Geometry and Design for Trustworthy Generative Inverse Problems

The paper proposes a measurement-geometry framework to quantify how well fixed measurement operators can distinguish between images generated by a prior, thereby guiding the design of more trustworthy…

cs.LG, cs.AI, cs.CV · Recent · May 27, 2026

Geometry-Correct Diffusion Posterior Sampling with Denoiser-Pullback Curvature Guidance and Manifold-Aligned Damping

Seunghyeok Shin, Minwoo Kim, Dabin Kim, Hongki Lim

The paper introduces a novel diffusion posterior sampling method that stabilizes and accelerates data-consistent sampling by replacing hand-tuned guidance weights with a per-noise-level, curvature-gui…

cs.CL, cs.LG · Recent · May 30, 2026

Towards Lightweight Reliability: Using Soft Prompts for Hallucination Mitigation in Large Language Models

S M Tahmid Siddiqui, Akib Jawad Ononto, Anoop Singhal, Latifur Khan

The paper introduces Responsible Contrastive Soft Prompting (RCSP), a parameter-efficient method using soft prompts to improve LLM reliability by simultaneously suppressing hallucinations, encouraging…

cs.CV, cs.RO · Recent · Jun 1, 2026

Not All Points Are Equal: Uncertainty-Aware 4D LiDAR Scene Synthesis

Xiang Xu, Alan Liang, Youquan Liu, Xian Sun +4 more

The paper introduces U4D, an uncertainty-aware framework that synthesizes 4D LiDAR scenes by prioritizing the reconstruction of geometrically difficult and uncertain regions first, leading to state-of…

cs.CV, cs.CL · Recent · May 29, 2026

Learning from Fine-Grained Visual Discrepancies: Mitigating Multimodal Hallucinations via In-Context Visual Contrastive Optimization

Haolin Deng, Xin Zou, Zhiwei Jin, Chen Chen +2 more

The paper proposes In-Context Visual Contrastive Optimization (IC-VCO) to rigorously mitigate multimodal hallucinations in Vision-Language Models by optimizing contrastive learning within a shared mul…

cs.CV, cs.AI · Recent · May 29, 2026

Real2SAM2Real: Generative 3D Caches as Complementary Context for Video Diffusion

Jiayi Wu, Haoming Cai, Cornelia Fermuller, Christopher Metzler +1 more

Real2SAM2Real introduces a framework that uses explicit 3D caches, derived from 3D lifting models, to provide robust geometric guidance to Video Diffusion Models, significantly improving spatiotempora…

cs.RO, cs.CV · Recent · Jun 1, 2026

RoboDream: Compositional World Models for Scalable Robot Data Synthesis

Junjie Ye, Rong Xue, Basile Van Hoorick, Runhao Li +5 more

RoboDream introduces an embodiment-centric world model that synthesizes photorealistic, physically feasible robot demonstrations by decoupling motion generation from environment synthesis, significant…

cs.CV, cs.AI · Recent · May 28, 2026

Mitigating Hallucination in Vision-Language Models through Barrier-Regulated Adaptive Closed-form Steering

Soumyadeep Jana, Pulkit Mittal, Sanasam Ranbir Singh

The paper proposes BRACS, a training-free steering framework that adaptively corrects visual grounding failures in large vision-language models, significantly reducing object hallucination without sac…

cs.LG, cs.AI, math.OC · Recent · May 29, 2026

Unlearning in Diffusion Models: A Unified Framework with KL Divergence and Likelihood Constraints

Shervin Khalafi, Alejandro Ribeiro, Dongsheng Ding

The paper proposes a unified, constrained optimization framework using KL divergence and likelihood constraints to achieve effective and principled unlearning in diffusion models.

cs.AI · Recent · May 27, 2026

Reasoning Matters: Mitigate Hallucination in Multimodal Large Reasoning Models via Reasoning-Conditioned Preference Optimization

Jiawei Kong, Hao Fang, Shunxiang Liao, Jinyu Li +4 more

The paper proposes Reasoning-Conditioned Direct Preference Optimization (RC-DPO) to effectively mitigate hallucinations in multimodal large reasoning models by explicitly conditioning the preference o…

cs.CR · Recent · May 1, 2026

Repurposing Image Diffusion Models for Adversarial Synthetic Structured Data: A Case Study of Ground Truth Drift

Adam Arthur, Christopher Schwartz

The paper demonstrates that off-the-shelf image diffusion models, like Stable Diffusion, can be repurposed to generate synthetic structured data, posing a threat of ground truth drift in closed eviden…

cs.CL · Recent · May 31, 2026

Revise, Don't Freeze: Sampler-Matched Training for Self-Correcting Masked Diffusion Language Models

Longxuan Yu, Shaorong Zhang, Yu Fu, Hui Liu +2 more

The paper introduces D3IM, a novel parameter-free sampler that enables direct revision of visible tokens in Masked Diffusion Language Models, and proposes SCOPE to mitigate the model's tendency to per…

cs.CV, cs.AI · Recent · Jun 1, 2026

Fast and Lightweight Novel View Synthesis with Differentiable Multiplane Image

Kaidi Zhang, Guanxu Zhu

The paper proposes a fast and lightweight novel view synthesis method using a differentiable Multiplane Image (MPI) representation, achieving significant speed and size improvements over state-of-the-…

cs.CV, cs.AI · Recent · May 28, 2026

Versatile Framework with Semantic and Structural guidance for Image Reconstruction from Brain Activity

Yizhuo Lu, Changde Du, Qiongyi Zhou, Liuyun Jiang +1 more

The paper proposes MindDiffuser, a two-stage framework that significantly improves image reconstruction from brain activity by combining semantic guidance from text-to-image models with structural ref…

cs.CV, cs.GR · Recent · Jun 1, 2026

Ultra Diffusion Poser: Diffusion-Based Human Motion Tracking From Sparse Inertial Sensors and Ranging-Based Between-Sensor Distances

Dominik Hollidt, Tommaso Bendinelli, Christian Holz

Ultra Diffusion Poser is a novel diffusion model that improves human motion tracking from sparse IMUs and UWB ranging by explicitly modeling the geometric constraints imposed by inter-sensor distances…

cs.CV · Recent · Jun 1, 2026

LL-Bench: Rethinking Low-Level Vision Evaluation in the Era of Large-Scale Generative Models

Lu Liu, Huiyu Duan, Chenxin Zhu, Jintong Lu +5 more

The paper introduces LL-Bench, a comprehensive benchmark for evaluating large-scale generative models on low-level vision tasks, and proposes LL-Score, an MLLM-based evaluator that better aligns quali…

cs.CV, cs.CR, cs.LG · Recent · Apr 14, 2026

PatchPoison: Poisoning Multi-View Datasets to Degrade 3D Reconstruction

Prajas Wadekar, Venkata Sai Pranav Bachina, Kunal Bhosikar, Ankit Gangwal +1 more

PatchPoison introduces a lightweight dataset-poisoning method that injects small, high-frequency adversarial patches into multi-view image datasets to systematically corrupt feature matching and degra…

cs.AI, cs.LG · Recent · May 29, 2026

TIGER: Traceable Inference with Graph-Based Evidence Routing for Mitigating Hallucinations in Multimodal Generation

Kaixiang Zhao, Tianrun Yu, Shawn Huang, Porter Jenkins +2 more

TIGER is an inference-time framework that uses graph-based evidence routing to independently assess and repair unsupported facts (hallucinations) in multimodal generation.

cs.CR · Recent · Apr 2, 2026

Diffusion-Guided Adversarial Perturbation Injection for Generalizable Defense Against Facial Manipulations

Yue Li, Linying Xue, Kaiqing Lin, Hanyu Quan +4 more

The paper proposes AEGIS, a novel diffusion-guided method for injecting adversarial perturbations into the latent space to create generalizable and robust defenses against advanced facial deepfake man…
