Similar to 2605.28174 · 18 results
Steffen Knoblauch, Hao Li, Gengchen Mai, Konstantin Klemmer +2 more
The paper advocates for a paradigm shift toward joint Spatial Representation Learning (SRL) that unifies raster imagery and structured vector data into a single embedding space for developing more sem…
The paper introduces CAFOSat, a large-scale, strongly annotated, and infrastructure-aware dataset designed to improve the accuracy of mapping Concentrated Animal Feeding Operations (CAFOs) from high-r…
LALE introduces a novel lightweight architecture that efficiently combines local convolutional features and global transformer context for land-cover segmentation, achieving superior efficiency and pe…
The paper introduces a knowledge distillation framework to adapt a dead tree detection model trained on one geographical area (Finland) to multiple diverse forest types (Poland, Germany, Estonia), ach…
CIPER proposes a unified transformer framework to simultaneously perform cross-view image retrieval and precise 3-DoF pose estimation, overcoming the limitations of cascaded, separate methods.
This paper introduces a novel cloud-removal framework using Denoising Diffusion Probabilistic Models and a Masked Diffusion Transformer to generate cloud-free multispectral flood imagery, significantl…
Adrián Cánovas-Rodriguez, Miguel A. González-Illán, Maria Fernanda García-Cruz, Pedro Nortes Tortosa +4 more
The paper proposes an attention-enhanced deep learning framework using EfficientNet and CBAM to achieve high accuracy (93.3%) in classifying peach leaf damage, demonstrating improved robustness under…
Places in the Wild introduces a massive, high-resolution RAW photograph dataset of 67,574 images captured in situ across 810 locations, providing unprecedented detail for ecologically valid vision res…
The paper introduces MetricScenes, a new large-scale, in-the-wild dataset, and demonstrates that fine-tuning existing geometry models on this dataset significantly mitigates the scale-collapse problem…
The paper proposes xModel-KD, a cross-modal knowledge distillation framework, to improve 3D point cloud segmentation by effectively transferring rich appearance cues from 2D images to sparse 3D geomet…
Jie Gao, Jie Ma, Kaihui Lin, Kai Ye +3 more
The paper introduces SkyShield, the first front-view monocular semantic occupancy benchmark for low-altitude urban UAV flight, along with a novel metric and model to address the unique safety challeng…
The paper introduces SPARROW, an autonomous, open-source platform that uses solar power, edge AI, and satellite communication to enable continuous, scalable biodiversity monitoring in remote global ec…
The paper proposes FedSAP, a framework that stabilizes federated prototype learning by delaying global alignment and enforcing inter-class structure, significantly improving representation quality und…
DarkVesselNet is a novel multi-modal deep learning framework that fuses SAR, optical, and AIS data to accurately detect vessels that do not report their presence via Automatic Identification System (A…
This paper proposes a novel volumetric change detection sub-task for monitoring slope instabilities using time-lapse cameras, demonstrating that dense and semi-dense feature matching techniques are ro…
Yuming Zhao, Junhui Hou, Qijian Zhang, Jia Qin +1 more
The paper introduces PRISM, a novel representation learning framework that learns isometric embeddings by explicitly modeling the intrinsic geodesic metric of 3D surfaces, achieving superior performan…
This paper evaluates the physical transfer of adversarial patches against aerial vehicle detectors, finding that while digitally optimized patches can be highly effective, their real-world robustness…
GeoSAM-3D proposes a novel framework for open-vocabulary 3D scene segmentation from simple monocular video by propagating object prompts using a geodesic distance kernel on a reconstructed Gaussian sc…