Papers similar to 2606.02022

~ similar to 2606.02022· 19 results

cs.CVcs.AIRecentMay 29, 2026

Redefining Instance Matching: A Unified Framework for Part-Aware Matching in Panoptic Segmentation Evaluation

Erik Großkopf, Soumya Snigdha Kundu, Hendrik Möller, Nicolas Münster +8 more

The paper proposes a unified framework to systematically redefine instance matching for Panoptic Quality evaluation, moving beyond the standard One-to-One matching to accommodate complex scenarios lik…

View →

cs.AIcs.DBcs.IRRecentMay 29, 2026

Vector Linking via Cross-Model Local Isometric Consistency

Ziying Chen, Yang Cao, He Sun, Beining Yang +1 more

The paper proposes a novel geometric embedding hashing method to recover object correspondences (vector links) between two embedding clouds generated by different black-box encoders using only a small…

View →

cs.CVRecentJun 1, 2026

Honey, I Shrunk the Arc de Triomphe!

Yuanbo Xiangli, Hanyu Chen, Xueqing Tsang, Noah Snavely

The paper introduces MetricScenes, a new large-scale, in-the-wild dataset, and demonstrates that fine-tuning existing geometry models on this dataset significantly mitigates the scale-collapse problem…

View →

cs.CVRecentJun 1, 2026

LL-Bench: Rethinking Low-Level Vision Evaluation in the Era of Large-Scale Generative Models

Lu Liu, Huiyu Duan, Chenxin Zhu, Jintong Lu +5 more

The paper introduces LL-Bench, a comprehensive benchmark for evaluating large-scale generative models on low-level vision tasks, and proposes LL-Score, an MLLM-based evaluator that better aligns quali…

View →

cs.LGcs.CVRecentJun 1, 2026

Closing the Alignment-Maturity Gap in Federated Prototype Learning

Mario Casado-Diez, Alejandro Dopico-Castro, Verónica Bolón-Canedo, Bertha Guijarro-Berdiñas

The paper proposes FedSAP, a framework that stabilizes federated prototype learning by delaying global alignment and enforcing inter-class structure, significantly improving representation quality und…

View →

cs.CVcs.AIcs.LGRecentJun 2, 2026

Formalizing the Binding Problem

Lianghuan Huang, Yihao Li, Saeed Salehi, Yingshan Chang +2 more

This paper formalizes the binding problem using information theory and develops a probing method to measure binding information in deep learning representations, demonstrating that binding is crucial…

View →

cs.LGcs.AIstat.MLRecentMay 28, 2026

CalArena: A Large-Scale Post-Hoc Calibration Benchmark

Eugène Berta, David Holzmüller, Francis Bach, Michael I. Jordan

The paper introduces CalArena, a large-scale, standardized benchmark covering nearly 2000 experiments to comprehensively evaluate post-hoc calibration methods, finding that smooth calibration function…

View →

cs.CVcs.AIcs.LGRecentMay 28, 2026

Learning Context-Conditioned Predicate Semantics via Prototype Feedback

NamGyu Jung, Chang Choi

The paper proposes AlignG, a method that learns context-conditioned predicate semantics by using prototype feedback to adapt relation representations based on image-specific evidence, significantly im…

View →

cs.CVRecentJun 1, 2026

Multi-modal Video Representation Alignment for Robust Self-supervised Driver Distraction Detection

David J. Lerch, Livien Majer, Zeyun Zhong, Manuel Martin +2 more

The paper proposes a novel global multi-modal alignment framework to robustly learn video representations from noisy and complementary sensor data, significantly improving driver distraction detection…

View →

cs.CVcs.AIRecentMay 30, 2026

Benchmarks for Vision-Language Models in Urban Perception Should Be Reliability-Aware and Negotiated

Rashid Mushkani

The paper argues that benchmarking Vision-Language Models (VLMs) for urban perception must treat human disagreement and non-response as key measurement outcomes, rather than assuming perfect consensus…

View →

cs.CVcs.AIcs.LGRecentJun 1, 2026

Understanding Identity Continuity in Thermal Video through Scene-Level Consistency

Wei-Chieh Sun, Gyungmin Ko, Heejae Kwon, Hsiang-Wei Huang +1 more

The paper proposes a lightweight post-processing framework that enhances identity continuity in thermal pedestrian tracking by leveraging scene-level spatial-temporal consistency, achieving improved t…

View →

cs.CVcs.AIcs.LGRecentMay 30, 2026

MoEIoU: Rethinking Bounding-Box Regression as a Mixture of Experts

Vinay Edula, Priyanka Bagade

The paper proposes MoEIoU, a novel mixture-of-experts based regression loss that adaptively models bounding-box localization errors, achieving superior convergence and accuracy in object detection.

View →

cs.CVcs.RORecentJun 3, 2026

CIPER: A Unified Framework for Cross-view Image-retrieval and Pose-estimation

Yurim Jeon, Dongseong Seo, Seung-Woo Seo

CIPER proposes a unified transformer framework to simultaneously perform cross-view image retrieval and precise 3-DoF pose estimation, overcoming the limitations of cascaded, separate methods.

View →

cs.CVcs.AIRecentMay 28, 2026

GiPL: Generative augmented iterative Pseudo-Labeling for Cross-Domain Few-Shot Object Detection

Jiacong Liu, Shu Luo, Yikai Qin, Yaze Zhao +2 more

GiPL proposes a novel two-branch framework combining iterative pseudo-label self-training and generative data augmentation to significantly improve Cross-Domain Few-Shot Object Detection by better uti…

View →

cs.LGcs.AIRecentMay 28, 2026

KLAS: Using Similarity to Stitch Neural Networks for Improved Accuracy-Efficiency Tradeoffs

Debopam Sanyal, Anantharaman Iyer, Alind Khare, Trisha Jain +4 more

KLAS introduces a novel framework that uses KL divergence to automatically select optimal pairs of pretrained models for stitching, significantly improving the accuracy-efficiency tradeoff of resultin…

View →

cs.CRcs.CVRecentMay 28, 2026

AdvScene: Rethinking Adversarial Patch Evaluation Through Scene Robustness

Xiaoyong, Yuan, Lan, Zhang

The paper introduces AdvScene, a novel scene-grounded framework that measures the real-world 'scene robustness' of adversarial patches by characterizing their operational envelope across varying viewp…

View →

cs.CVcs.AIcs.RORecentMay 28, 2026

PInVerify: An Offline Embodied Benchmark for Active Instance Verification

Yuhang Jiang

The paper introduces PInVerify, an offline embodied benchmark for Active Instance Verification (AIV), a task requiring agents to actively select viewpoints to confirm if a candidate object matches a f…

View →

cs.CVcs.AIRecentMay 27, 2026

ROVER: Routing Object-Centric Visual Evidence for Grounded Multi-Image Reasoning

Guannan Lv, Ren Nie, Hongjian Dou, Tingting Gao

ROVER is a lightweight, learnable plugin that efficiently routes and integrates object-centric visual evidence across multiple images and objects, significantly improving performance on grounded multi…

View →

cs.CVRecentJun 1, 2026

TROPHIES: Temporal Reconstruction of Places, Humans, and Cameras from Multi-view Videos

Jinpeng Liu, Yukang Xu, Yutong Li, Xingyu Liu

TROPHIES introduces a unified framework to jointly reconstruct dynamic humans, static scenes, and camera poses from multi-view videos, achieving globally consistent and physically plausible 4D reconst…

View →