Papers similar to 2606.00844

19 results

cs.CV · cs.AI · cs.LG · Recent · May 29, 2026

FOCUS: Forcing In-Context Object Localization through Visual Support Constraints and Policy Optimization

The paper introduces a novel two-stage framework to achieve robust, category-agnostic in-context (ICL) object localization by optimizing attention and minimizing localization error using reinforcement…

cs.CV · cs.RO · Recent · Jun 3, 2026

CIPER: A Unified Framework for Cross-view Image-retrieval and Pose-estimation

Yurim Jeon, Dongseong Seo, Seung-Woo Seo

CIPER proposes a unified transformer framework to simultaneously perform cross-view image retrieval and precise 3-DoF pose estimation, overcoming the limitations of cascaded, separate methods.

cs.CV · cs.AI · Recent · May 29, 2026

Redefining Instance Matching: A Unified Framework for Part-Aware Matching in Panoptic Segmentation Evaluation

Erik Großkopf, Soumya Snigdha Kundu, Hendrik Möller, Nicolas Münster +8 more

The paper proposes a unified framework to systematically redefine instance matching for Panoptic Quality evaluation, moving beyond the standard One-to-One matching to accommodate complex scenarios lik…

cs.CV · cs.AI · cs.LG · Recent · Jun 1, 2026

Ranking vs. Assignment: The Metric Mismatch in Multi-View Object Association

Matvei Shelukhan, Timur Mamedov, Aleksandr Chukhrov, Karina Kvanchiani

The paper identifies a fundamental mismatch between standard pairwise ranking metrics (like AP and FPR-95) and the true assignment objective in multi-view object association, proposing a Sinkhorn-base…

cs.RO · cs.AI · Recent · May 28, 2026

V2I Work Zone Geometry Reconstruction with Pose-Conditioned UWB Range Denoising

Jiaxi Liu, Hangyu Li, Yang Cheng, Rui Gana +6 more

The paper proposes a pose-conditioned, permutation-equivariant denoiser to accurately reconstruct work zone geometry using noisy Ultra-Wideband (UWB) range data from connected and autonomous vehicles…

cs.CV · Recent · Jun 1, 2026

Symmetry-Aware 9D Pose Estimation with Sim(3)-Consistent Feature and Spherical Inception Convolution

Panfei Cheng, Hongshan Yu, Wenrui Chen, Xiaojun Tang +2 more

The paper proposes a novel symmetry-aware, category-level method for 9D object pose estimation that accurately estimates translation and size first, followed by rotation, achieving state-of-the-art re…

cs.CV · cs.AI · Recent · May 28, 2026

GiPL: Generative augmented iterative Pseudo-Labeling for Cross-Domain Few-Shot Object Detection

Jiacong Liu, Shu Luo, Yikai Qin, Yaze Zhao +2 more

GiPL proposes a novel two-branch framework combining iterative pseudo-label self-training and generative data augmentation to significantly improve Cross-Domain Few-Shot Object Detection by better uti…

cs.LG · cs.CV · Recent · Jun 1, 2026

Closing the Alignment-Maturity Gap in Federated Prototype Learning

Mario Casado-Diez, Alejandro Dopico-Castro, Verónica Bolón-Canedo, Bertha Guijarro-Berdiñas

The paper proposes FedSAP, a framework that stabilizes federated prototype learning by delaying global alignment and enforcing inter-class structure, significantly improving representation quality und…

cs.CV · cs.AI · Recent · May 28, 2026

xModel-KD: Cross-modal Knowledge Distillation for 3D Scene Perception using LiDAR

Thenukan Pathmanathan, Kanchan Keisham, Thangarajah Akilan

The paper proposes xModel-KD, a cross-modal knowledge distillation framework, to improve 3D point cloud segmentation by effectively transferring rich appearance cues from 2D images to sparse 3D geomet…

cs.CV · cs.AI · cs.RO · Recent · May 28, 2026

Energy-Aware NECO for Single-Pass Pixel-wise Out-of-Distribution Detection in Semantic Segmentation

Boyuan Zhang, Huanshan Huang, Yifei Cao

The paper proposes Energy-Aware NECO, a single-pass hybrid detector that combines geometric ratio and logit-based energy scores to achieve superior pixel-wise out-of-distribution detection for semanti…

cs.CV · cs.CG · Recent · May 28, 2026

S2MDF: A Plug-And-Play Layer for Intersection-Free Multi-Object Signed Distance Fields

Deniz Sayin Mercadier, Federico Stella, Aurel Bizeau, Nicolas Talabot +1 more

The paper introduces S2MDF, a plug-and-play module that enforces a hard constraint to eliminate interpenetrations in multi-object Signed Distance Field (SDF) representations, significantly improving p…

cs.AI · cs.DB · cs.IR · Recent · May 29, 2026

Vector Linking via Cross-Model Local Isometric Consistency

Ziying Chen, Yang Cao, He Sun, Beining Yang +1 more

The paper proposes a novel geometric embedding hashing method to recover object correspondences (vector links) between two embedding clouds generated by different black-box encoders using only a small…

cs.CV · cs.RO · Recent · Jun 2, 2026

Exploring Easy Boosts for Lidar Semantic Scene Completion

Tetiana Martyniuk, Jonathan Seele, Alexandre Boulch, Gilles Puy +2 more

The paper shows that simple, non-architectural enhancements, such as adding semantic pseudo-labels and visibility information, can significantly boost Lidar Semantic Scene Completion performance.

cs.CV · cs.AI · cs.LG · Recent · May 28, 2026

Learning Context-Conditioned Predicate Semantics via Prototype Feedback

NamGyu Jung, Chang Choi

The paper proposes AlignG, a method that learns context-conditioned predicate semantics by using prototype feedback to adapt relation representations based on image-specific evidence, significantly im…

cs.CV · cs.LG · eess.IV · Recent · Jun 3, 2026

An Open-Source Two-Stage Computer Vision Pipeline for Fine-Grained Vehicle Classification using Vision Transformers

Gandhimathi Padmanaban, Fred Feng

This paper presents an open-source computer vision pipeline for classifying vehicle body types from naturalistic roadway video.

cs.CR · Recent · Apr 23, 2026

Cross-Modal Phantom: Coordinated Camera-LiDAR Spoofing Against Multi-Sensor Fusion in Autonomous Vehicles

Shahriar Rahman Khan, Raiful Hasan

The paper demonstrates a coordinated, cross-modal spoofing attack that successfully deceives state-of-the-art multi-sensor fusion systems in autonomous vehicles by making multiple sensors agree on a f…

cs.LG · cs.AI · stat.ML · Recent · May 28, 2026

CalArena: A Large-Scale Post-Hoc Calibration Benchmark

Eugène Berta, David Holzmüller, Francis Bach, Michael I. Jordan

The paper introduces CalArena, a large-scale, standardized benchmark covering nearly 2000 experiments to comprehensively evaluate post-hoc calibration methods, finding that smooth calibration function…

cs.CV · cs.AI · Recent · May 28, 2026

Beyond 3D VQAs: Injecting 3D Spatial Priors into Vision-Language Models for Enhanced Geometric Reasoning

Chun-Hsiao Yeh, Shengyi Qian, Manchen Wang, Yi Ma +2 more

The paper proposes GASP, a framework that injects fundamental geometric priors directly into Vision-Language Models (VLMs) using ground-truth video geometry, significantly enhancing 3D spatial reasoni…

cs.CV · cs.AI · Recent · Jun 1, 2026

Parameter-Efficient Fine-Tuning of Large Pretrained Models for Instance Segmentation Tasks

Nermeen Abou Baker, David Rohrschneider, Uwe Handmann

This paper investigates the application of Parameter-Efficient Fine-Tuning (PEFT) methods, specifically adapters and LoRA, to large pretrained models for instance segmentation, demonstrating that thes…
