~ similar to 2606.02310· 19 results
FLORO is a multimodal geospatial foundation model that learns transferable remote sensing representations from a small, diverse corpus, achieving strong performance across various sensor types and res…
Steffen Knoblauch, Hao Li, Gengchen Mai, Konstantin Klemmer +2 more
The paper advocates for a paradigm shift toward joint Spatial Representation Learning (SRL) that unifies raster imagery and structured vector data into a single embedding space for developing more sem…
LALE introduces a novel lightweight architecture that efficiently combines local convolutional features and global transformer context for land-cover segmentation, achieving superior efficiency and pe…
The paper introduces CAFOSat, a large-scale, strongly annotated, and infrastructure-aware dataset designed to improve the accuracy of mapping Concentrated Animal Feeding Operations (CAFOs) from high-r…
DarkVesselNet is a novel multi-modal deep learning framework that fuses SAR, optical, and AIS data to accurately detect vessels that do not report their presence via Automatic Identification System (A…
Salim I. Amoukou, Emanuele Albini, Tom Bewley, Saumitra Mishra +1 more
The paper introduces Entropic Projection Alignment (EPA), a unified framework that estimates, explains, and improves model performance under distribution shift by aligning source and target distributi…
Adrián Cánovas-Rodriguez, Miguel A. González-Illán, Maria Fernanda García-Cruz, Pedro Nortes Tortosa +4 more
The paper proposes an attention-enhanced deep learning framework using EfficientNet and CBAM to achieve high accuracy (93.3%) in classifying peach leaf damage, demonstrating improved robustness under…
The paper proposes HADT, a novel transformer-based architecture using differential attention and relational tokenization, to enable adaptive and real-time autonomous resource management for heterogene…
This paper proposes a novel volumetric change detection sub-task for monitoring slope instabilities using time-lapse cameras, demonstrating that dense and semi-dense feature matching techniques are ro…
The paper proposes SafeDIG, a robust safety steering framework that adapts Diffusion Transformers for text-to-image generation by treating safety control as position-aware sparse feature transfer, ens…
The paper demonstrates that replacing standard pointwise losses (like MSE) with multi-quantile regression significantly improves precipitation nowcasting accuracy and provides valuable risk estimates…
The paper proposes a unified, constrained optimization framework using KL divergence and likelihood constraints to achieve effective and principled unlearning in diffusion models.
DASH introduces a dual-branch distillation framework to effectively compress class-conditional diffusion models by independently supervising both score branches, significantly preserving guidance fide…
Longxuan Yu, Yunshu Wu, Yu Fu, Siheng Xiong +4 more
The paper introduces DSL-LLaDA, a method that lightly adapts a pre-trained masked diffusion language model to perform continuous denoising in embedding space, significantly improving text generation q…
Pengfei Jin, Yiqi Tian, Kailong Fan, Bingjie Qi +1 more
The paper introduces Robust Prior Update (RPU), a module that improves the faithfulness of diffusion-based inverse solvers by stabilizing the prior update step, thereby reducing measurement-conditione…
Pengzhen Chen, Yanwei Liu, Xiaoyan Gu, Xiaojun Chen +2 more
Rel-Zero proposes a novel zero-watermarking technique that embeds invisible watermarks by exploiting the invariance of relational distances between image patches during AI editing, achieving superior…
Yuyang Zhao, Yicheng Pan, Qiyuan He, Jincheng Yu +5 more
SANA-Streaming introduces a novel, efficient framework that enables real-time, high-resolution streaming video-to-video editing by combining a hybrid diffusion transformer with specialized training an…
The paper demonstrates that current AI watermark removal techniques fail to achieve true forensic stealth, as the removal process often leaves behind detectable signals that distinguish the output fro…
This paper demonstrates that neural operators used in digital twins for nuclear systems are highly vulnerable to undetectable, sparse adversarial perturbations, necessitating new robustness guarantees…