Papers similar to 2606.01899

Similar to 2606.01899 · 20 results

cs.CV · cs.RO · Recent · Jun 3, 2026

CIPER: A Unified Framework for Cross-view Image-retrieval and Pose-estimation

Yurim Jeon, Dongseong Seo, Seung-Woo Seo

CIPER proposes a unified transformer framework to simultaneously perform cross-view image retrieval and precise 3-DoF pose estimation, overcoming the limitations of cascaded, separate methods.

cs.RO · cs.AI · Recent · May 28, 2026

V2I Work Zone Geometry Reconstruction with Pose-Conditioned UWB Range Denoising

Jiaxi Liu, Hangyu Li, Yang Cheng, Rui Gana +6 more

The paper proposes a pose-conditioned, permutation-equivariant denoiser to accurately reconstruct work zone geometry using noisy Ultra-Wideband (UWB) range data from connected and autonomous vehicles…

eess.SP · cs.CR · cs.LG · Recent · Apr 14, 2026

Rapid LoRA Aggregation for Wireless Channel Adaptation in Open-Set Radio Frequency Fingerprinting

Mingxi Zhang, Renjie Xie, Jincheng Wang, Guyue Li +1 more

The paper proposes a lightweight, self-adaptive framework using LoRA to efficiently extract and aggregate radio frequency fingerprints for robust open-set authentication in dynamic wireless environmen…

eess.SP · cs.AI · Recent · May 29, 2026

DRIFT: Joint Channel Estimation and Prediction Towards Pilotless 6G Non-Terrestrial Networks

Bruno De Filippo, Carla Amatetti, Alessandro Vanelli-Coralli

The paper proposes DRIFT, a lightweight joint channel estimation and prediction framework, to significantly reduce pilot overhead and boost spectral efficiency in power-constrained LEO Non-Terrestrial…

cs.RO · cs.CR · Recent · May 13, 2026

Uncertainty-Aware 3D Position Refinement for Multi-UAV Systems

Hosam Alamleh, Damir Pulatov

The paper proposes an uncertainty-aware, decentralized fusion layer for multi-UAV systems that significantly improves 3D localization robustness by incorporating neighbor constraints and handling faul…

cs.CV · cs.AI · cs.LG · Recent · May 29, 2026

FOCUS: Forcing In-Context Object Localization through Visual Support Constraints and Policy Optimization

Mohammed Asad Karim, Vinay Kumar Verma

The paper introduces a two-stage framework for robust, category-agnostic in-context object localization, optimizing attention and minimizing localization error using reinforcement…

cs.CV · cs.AI · Recent · May 27, 2026

SSR3D-LLM: Structured Spatial Reasoning via Latent Steps for Fine-Grained Grounding in Unified 3D-LLMs

Jiawei Li, Ziyi Liu, Weijie Shi, Long Chen +2 more

SSR3D-LLM introduces a structured spatial reasoning interface for unified 3D-LLMs, allowing fine-grained object grounding by generating and processing sequential latent spatial steps.

cs.CV · cs.AI · cs.LG · Recent · May 30, 2026

MoEIoU: Rethinking Bounding-Box Regression as a Mixture of Experts

Vinay Edula, Priyanka Bagade

The paper proposes MoEIoU, a novel mixture-of-experts based regression loss that adaptively models bounding-box localization errors, achieving superior convergence and accuracy in object detection.

cs.CV · cs.AI · Recent · May 27, 2026

FLORO: A Multimodal Geospatial Foundation Model for Ecological Remote Sensing Across Sensors and Scales

Jorge L. Rodriguez, Victor Angulo Morales, Areej Alwahas, Mariana Elias Lara +5 more

FLORO is a multimodal geospatial foundation model that learns transferable remote sensing representations from a small, diverse corpus, achieving strong performance across various sensor types and res…

eess.SP · cs.AI · cs.NI · Recent · May 31, 2026

A Communication-Centric 6G-LLM Architecture for Scalable Tactical Autonomous Defense Vehicle Networks

Kiran Khurshid, Shumaila Javaid, Nasir Saeed

The paper proposes a communication-centric 6G-LLM architecture for tactical autonomous defense vehicles, demonstrating significant improvements in coordination and communication efficiency over conven…

cs.CR · Recent · Mar 31, 2026

Client-Verifiable and Efficient Federated Unlearning in Low-Altitude Wireless Networks

Yuhua Xu, Mingtao Jiang, Chenfei Hu, Yinglong Wang +4 more

The paper proposes VerFU, a client-verifiable federated unlearning framework for low-altitude wireless networks that allows devices to ensure the server accurately removes their historical data contri…

cs.CV · cs.AI · Recent · May 28, 2026

VLM3: Vision Language Models Are Native 3D Learners

Zhipeng Cai, Zhuang Liu, Yunyang Xiong, Zechun Liu +2 more

The paper proposes VLM3, a simple, scalable method that demonstrates standard Vision Language Models (VLMs) can natively learn 3D understanding by focusing on architectural simplicity and specific dat…

cs.CR · eess.SP · Recent · May 14, 2026

Model Forensics in AI-Native Wireless Networks: Taxonomy, Applications, and Case Study

Pengyu Chen, Weiyang Li, Jin Xu, Jiacheng Wang +3 more

This paper surveys model forensics in AI-native wireless networks, detailing key security problems and demonstrating practical workflows for verifying model authenticity and detecting malicious functi…

cs.CV · cs.AI · Recent · May 27, 2026

ROVER: Routing Object-Centric Visual Evidence for Grounded Multi-Image Reasoning

Guannan Lv, Ren Nie, Hongjian Dou, Tingting Gao

ROVER is a lightweight, learnable plugin that efficiently routes and integrates object-centric visual evidence across multiple images and objects, significantly improving performance on grounded multi…

cs.AI · Recent · May 30, 2026

PropLLM: Propagation-Aware Scene Reconstruction for Network Fault Diagnosis

Zongzong Wu, Ming Zhao, Fengxiao Tang, Nei Kato

PropLLM introduces a novel propagation-aware framework that uses LLMs and hop-by-hop scene reconstruction to accurately localize root causes and determine fault types in complex network fault diagnosi…

cs.IR · Empirical · Recent · Jun 10, 2026

Tail-Aware Adaptive-k: Query-Adaptive Context Selection for Retrieval-Augmented Generation

Ziyu Song, Jiaming Fang, Kuangyu Li, Tuo Xia +1 more

This paper proposes Tail-Aware Adaptive-k (TAA-k), a training-free framework for adaptive context selection in retrieval-augmented generation systems using Extreme Value Theory.

cs.CV · cs.GR · Recent · Jun 1, 2026

Ultra Diffusion Poser: Diffusion-Based Human Motion Tracking From Sparse Inertial Sensors and Ranging-Based Between-Sensor Distances

Dominik Hollidt, Tommaso Bendinelli, Christian Holz

Ultra Diffusion Poser is a novel diffusion model that improves human motion tracking from sparse IMUs and UWB ranging by explicitly modeling the geometric constraints imposed by inter-sensor distances…

cs.CV · Recent · Jun 1, 2026

Training-Free Composed Video Retrieval via Visual Representation-Guided Video-LLM Reasoning

Yang Liu, Qianqian Xu, Peisong Wen, Siran Dai +1 more

The paper proposes a training-free framework, Visual Representation-Guided Video-LLM Reasoning, to perform composed video retrieval by using visual examples and text instructions, achieving strong per…

cs.CV · cs.AI · Recent · Jun 1, 2026

MASER: Modality-Adaptive Specialist Routing for Embodied 3D Spatial Intelligence

Hilton Raj, Vishnuram AV

MASER is a lightweight framework that dynamically routes a shared Vision-Language Model (VLM) to the most appropriate modality-specific adapter (e.g., point cloud, RGB) based on the input question, si…

cs.LG · cs.AI · Recent · May 27, 2026

Locality-Aware Redundancy Pruning for LLM Depth Compression

Vincent-Daniel Yun, Youngrae Kim, Woosang Lim, YoungJin Heo +2 more

The paper proposes Locality-Aware Redundancy Pruning (LoRP), a training-free method that prunes LLM layers by exploiting localized inter-layer redundancy, leading to improved efficiency while maintain…
