Similar to 2605.28441 · 19 results
Haolin Deng, Xin Zou, Zhiwei Jin, Chen Chen +2 more
The paper proposes In-Context Visual Contrastive Optimization (IC-VCO) to rigorously mitigate multimodal hallucinations in Vision-Language Models by optimizing contrastive learning within a shared mul…
Hee Suk Yoon, Eunseop Yoon, Jaehyun Jang, SooHwan Eom +5 more
The paper proposes Visual Gradient Steering (VGS), a method that decomposes the distillation loss into language and visual components and steers the optimization to prioritize visual grounding, signif…
The paper introduces CERA, a novel contrastive retrieval framework that improves RAG factuality and interpretability by using subjectivity-based hard negative selection and an auxiliary attention alig…
The paper proposes BRACS, a training-free steering framework that adaptively corrects visual grounding failures in large vision-language models, significantly reducing object hallucination without sac…
The paper proposes AlignG, a method that learns context-conditioned predicate semantics by using prototype feedback to adapt relation representations based on image-specific evidence, significantly im…
Melihcan Erol, Suat Evren, Oktay Ozel, Alexander Morgan +2 more
The paper proposes WEINCE, a modified InfoNCE objective that uses extreme value theory corrections to improve contrastive learning by more accurately modeling the selection of hard negative examples.
Kecen Li, Chen Gong, Zinan Lin, Tianhao Wang +1 more
The paper proposes DP-GCL, a novel differentially private contrastive learning framework that improves representation learning on sensitive data by bounding gradient dependency through localized group…
The paper introduces Responsible Contrastive Soft Prompting (RCSP), a parameter-efficient method using soft prompts to improve LLM reliability by simultaneously suppressing hallucinations, encouraging…
This paper demonstrates that Concept Bottleneck Models (CBMs), despite their interpretability, are highly vulnerable to targeted adversarial attacks that manipulate semantic concepts, and proposes SPE…
The paper proposes an objective-wise reputation-market mechanism to dynamically calibrate and gate LLM-generated expert priors in multi-objective Bayesian optimization, showing that dynamic calibratio…
The paper proposes a disentangled representation framework to significantly improve few-shot layout-to-image generation by separating semantic identity from local visual details, thereby mitigating re…
The paper introduces Contrastive Reflection (CORE), a novel non-parametric method that rapidly improves language model reasoning by distilling contrasts between successful and unsuccessful problem att…
Bo Wang, Jia Ni, Mengnan Zhao, Zhan Qin +1 more
This paper systematically investigates unlearnable examples (UEs) across diverse training paradigms, finding that existing UEs fail under pretraining-finetuning (PF) settings, and proposes Shallow Sem…
The paper proposes a novel Disentanglement-based Equivariant Learning (DEAL) framework that enhances compositional VQA by disentangling concepts and enforcing equivariant constraints, achieving state-…
WenZhang Wei, Zhipeng Gui, Dehua Peng, Tiandi Ye +1 more
The paper proposes a Variational Adapter (VACSR) to improve cross-modal similarity representation by treating fine-grained image-text matching as a variational inference problem, thereby mitigating th…
Ziying Chen, Yang Cao, He Sun, Beining Yang +1 more
The paper proposes a novel geometric embedding hashing method to recover object correspondences (vector links) between two embedding clouds generated by different black-box encoders using only a small…
CORE-MTL proposes a representation-centric framework that uses causal orthogonal representations to disentangle task-relevant structure from nuisance variation in multi-task learning, achieving superi…
Qiaoru Li, Shaotian Liang, Jintao Chen, Haoran Sun +3 more
VITAL introduces a novel latent-space reasoning framework for medical MLLMs, utilizing visual-semantic dual supervision to enhance reasoning capabilities and provide crucial interpretability without s…
The study demonstrates that robust, domain-invariant representations of synthetic deception can be rapidly entrenched in LLMs using modest fine-tuning, detectable by linear probes even in early layers…