Papers similar to 2605.28100

19 results

cs.CV, cs.AI · Recent · May 27, 2026

FLORO: A Multimodal Geospatial Foundation Model for Ecological Remote Sensing Across Sensors and Scales

Jorge L. Rodriguez, Victor Angulo Morales, Areej Alwahas, Mariana Elias Lara +5 more

FLORO is a multimodal geospatial foundation model that learns transferable remote sensing representations from a small, diverse corpus, achieving strong performance across various sensor types and res…


cs.CV, cs.AI, cs.LG · Recent · May 30, 2026

CAFOSat: A Strongly Annotated Dataset for Infrastructure-Aware CAFO Mapping Using High-Resolution Imagery

Oishee Bintey Hoque, Nibir Chandra Mandal, Mandy L Wilson, Samarth Swarup +2 more

The paper introduces CAFOSat, a large-scale, strongly annotated, and infrastructure-aware dataset designed to improve the accuracy of mapping Concentrated Animal Feeding Operations (CAFOs) from high-r…


cs.CV, cs.LG · Recent · Jun 1, 2026

Deep Learning for Remote Sensing to Improve Flood Inundation Mapping

Yogesh Bhattarai, Vijay Chaudhary, Wai Lim Kim, Sanjib Sharma

This paper introduces a novel cloud-removal framework using Denoising Diffusion Probabilistic Models and a Masked Diffusion Transformer to generate cloud-free multispectral flood imagery, significantl…


cs.CV · Recent · Jun 1, 2026

Honey, I Shrunk the Arc de Triomphe!

Yuanbo Xiangli, Hanyu Chen, Xueqing Tsang, Noah Snavely

The paper introduces MetricScenes, a new large-scale, in-the-wild dataset, and demonstrates that fine-tuning existing geometry models on this dataset significantly mitigates the scale-collapse problem…


cs.CV · Recent · Jun 1, 2026

Cross-Domain Dead Tree Detection via Knowledge Distillation in Aerial Imagery

Anis Ur Rahman, Mete Ahishali, Einari Heinaro, Samuli Junttila

The paper introduces a knowledge distillation framework to adapt a dead tree detection model trained on one geographical area (Finland) to multiple diverse forest types (Poland, Germany, Estonia), ach…


cs.RO, cs.AI, cs.CV · Recent · May 31, 2026

DeepIPCv3: Event-Aware Multi-Modal Sensor Fusion for Sudden Pedestrian Crossing Avoidance

Oskar Natan, Andi Dharmawan, Aufaclav Zatu Kusuma Frisky, Jazi Eko Istiyanto +1 more

DeepIPCv3 is a novel multi-modal framework that fuses LiDAR and DVS event streams using cross-modal attention to achieve state-of-the-art, highly reactive avoidance maneuvers for sudden pedestrian cro…


cs.CL, cs.AI, cs.CV · Recent · Jun 1, 2026

PaSBench-Video: A Streaming Video Benchmark for Proactive Safety Warning

Yusong Zhao, Yuejin Xie, Youliang Yuan, Junjie Hu +3 more

The paper introduces PaSBench-Video, a comprehensive streaming video benchmark designed to rigorously test multimodal LLMs' ability to issue proactive safety warnings, finding that current models stru…


cs.AI · Recent · Jun 1, 2026

Spatial Representation Learning Beyond Pixels: Unifying Raster Data and Vector Semantics for Human-Centric Geospatial Foundation Models

Steffen Knoblauch, Hao Li, Gengchen Mai, Konstantin Klemmer +2 more

The paper advocates for a paradigm shift toward joint Spatial Representation Learning (SRL) that unifies raster imagery and structured vector data into a single embedding space for developing more sem…


cs.CV, cs.AI · Recent · Jun 1, 2026

Attention mechanisms and transfer learning for robust peach leaf damage classification under domain shift

Adrián Cánovas-Rodriguez, Miguel A. González-Illán, Maria Fernanda García-Cruz, Pedro Nortes Tortosa +4 more

The paper proposes an attention-enhanced deep learning framework using EfficientNet and CBAM to achieve high accuracy (93.3%) in classifying peach leaf damage, demonstrating improved robustness under…


cs.CV, cs.AI · Recent · Jun 1, 2026

Moment-Video: Diagnosing Temporal Fidelity of Video MLLMs on Momentary Visual Events

Xiaolin Liu, Yilun Zhu, Xiangyu Zhao, Xuehui Wang +8 more

The paper introduces Moment-Video, a new benchmark that diagnoses the ability of video MLLMs to understand brief, critical visual events, revealing that current models struggle significantly with temp…


cs.CV, cs.AI, cs.LG · Recent · Jun 1, 2026

Ranking vs. Assignment: The Metric Mismatch in Multi-View Object Association

Matvei Shelukhan, Timur Mamedov, Aleksandr Chukhrov, Karina Kvanchiani

The paper identifies a fundamental mismatch between standard pairwise ranking metrics (like AP and FPR-95) and the true assignment objective in multi-view object association, proposing a Sinkhorn-base…


cs.CR · Recent · Apr 23, 2026

Cross-Modal Phantom: Coordinated Camera-LiDAR Spoofing Against Multi-Sensor Fusion in Autonomous Vehicles

Shahriar Rahman Khan, Raiful Hasan

The paper demonstrates a coordinated, cross-modal spoofing attack that successfully deceives state-of-the-art multi-sensor fusion systems in autonomous vehicles by making multiple sensors agree on a f…


cs.CR, cs.ET, cs.LG · Recent · Apr 30, 2026

Selfie-Capture Dynamics as an Auxiliary Signal Against Deepfakes and Injection Attacks for Mobile Identity Verification

Erkka Rantahalvari, Olli Silvén, Zinelabidine Boulkenafet, Constantino Álvarez Casado

The paper demonstrates that passive motion traces recorded during a mobile selfie capture can serve as a measurable, low-friction auxiliary signal for enhancing both spoof screening and user identity…


cs.CR, cs.LG · Recent · May 19, 2026

Latent Geometry as a Structural Monitor: Eigenspace Alignment for Anomaly Detection in Anonymity Networks

Vaibhav Chhabra

The paper proposes using geometric metrics, specifically eigenspace alignment, to monitor the structural integrity of large behavioral populations, demonstrating its effectiveness in detecting network…


cs.CV · Recent · Jun 1, 2026

MORPHOS: Autoregressive 4D Generation with Temporal Structured Latents

Minkyung Kwon, Jinhyeok Choi, Youngjin Shin, Jaeyeong Kim +2 more

MORPHOS is a novel autoregressive framework that generates dynamic 3D assets (like meshes and radiance fields) from videos by using a unified 4D representation to ensure temporal consistency and handl…


cs.CV · Recent · Jun 1, 2026

Explainable Forensics of Manipulated Segments in Untrimmed Long Videos

Yue Feng, Jingjing Li, Qijia Lu, Wei Ji +8 more

This paper addresses the challenge of detecting and explaining AI-manipulated segments within long, untrimmed videos by proposing a new benchmark and a coarse-to-fine forensic detection framework.


cs.LG, astro-ph.IM, cs.CV · Recent · Jun 3, 2026

Identifying Gems from Roman RAPIDly

Karan Gandhi, Ashish A. Mahabal, Jacob E. Jencson, Russ R. Laher +5 more

This paper introduces a machine learning model, RuBR, and a methodology to reliably distinguish genuine astronomical transients from spurious detections for the upcoming Roman Space Telescope's data p…


cs.CV, cs.AI, cs.LG · Recent · Jun 1, 2026

Understanding Identity Continuity in Thermal Video through Scene-Level Consistency

Wei-Chieh Sun, Gyungmin Ko, Heejae Kwon, Hsiang-Wei Huang +1 more

The paper proposes a lightweight post-processing framework that enhances identity continuity in thermal pedestrian tracking by leveraging scene-level spatial-temporal consistency, achieving improved t…


cs.CV, cs.CR · Recent · May 17, 2026

Deepfake Detection in Social Media: A Temporal Artifact Analysis Using 3D Convolutional Neural Networks

Mohammadreza Rashidi, Raja Hashim Ali, Sami Ur Rahman

This paper proposes a 3D CNN detector that leverages temporal artifacts to accurately identify high-quality deepfake videos, demonstrating robust detection even after social media re-encoding.
