Papers similar to 2605.29761

~ similar to 2605.29761· 20 results

cs.CVRecentJun 1, 2026

From Extrinsic to Intrinsic: Geodesic-Guided Representation Learning for 3D Geometric Data

Yuming Zhao, Junhui Hou, Qijian Zhang, Jia Qin +1 more

The paper introduces PRISM, a novel representation learning framework that learns isometric embeddings by explicitly modeling the intrinsic geodesic metric of 3D surfaces, achieving superior performan…

View →

cs.CRcs.CVRecentMay 10, 2026

On the Generation and Mitigation of Harmful Geometry in Image-to-3D Models

Yule Liu, Yilong Yang, Jiale Teng, Hanze Jia +10 more

The paper systematically measures the risk of current image-to-3D models generating harmful geometries, finding that these models are effective at reconstruction and existing safeguards are insufficie…

View →

cs.GRcs.CGRecentMay 30, 2026

Subgrid Marching Tetrahedra

Hossein Baktash, Mark Gillespie, Keenan Crane

The paper introduces a subgrid marching tetrahedra scheme that accurately recovers complex, intersection-free manifold meshes from tetrahedral grids, overcoming limitations of classic marching methods…

View →

cs.AIcs.DBcs.IRRecentMay 29, 2026

Vector Linking via Cross-Model Local Isometric Consistency

Ziying Chen, Yang Cao, He Sun, Beining Yang +1 more

The paper proposes a novel geometric embedding hashing method to recover object correspondences (vector links) between two embedding clouds generated by different black-box encoders using only a small…

View →

cs.CVRecentJun 1, 2026

Honey, I Shrunk the Arc de Triomphe!

Yuanbo Xiangli, Hanyu Chen, Xueqing Tsang, Noah Snavely

The paper introduces MetricScenes, a new large-scale, in-the-wild dataset, and demonstrates that fine-tuning existing geometry models on this dataset significantly mitigates the scale-collapse problem…

View →

cs.CVcs.AIRecentMay 28, 2026

xModel-KD: Cross-modal Knowledge Distillation for 3D Scene Perception using LiDAR

Thenukan Pathmanathan, Kanchan Keisham, Thangarajah Akilan

The paper proposes xModel-KD, a cross-modal knowledge distillation framework, to improve 3D point cloud segmentation by effectively transferring rich appearance cues from 2D images to sparse 3D geomet…

View →

cs.CVcs.RORecentJun 1, 2026

Not All Points Are Equal: Uncertainty-Aware 4D LiDAR Scene Synthesis

Xiang Xu, Alan Liang, Youquan Liu, Xian Sun +4 more

The paper introduces U4D, an uncertainty-aware framework that synthesizes 4D LiDAR scenes by prioritizing the reconstruction of geometrically difficult and uncertain regions first, leading to state-of…

View →

cs.CVcs.AIRecentMay 30, 2026

GeoSAM-3D: Geodesic Prompt Propagation for Open-Vocabulary 3D Scene Segmentation from Monocular Video

Arun Sharma

GeoSAM-3D proposes a novel framework for open-vocabulary 3D scene segmentation from simple monocular video by propagating object prompts using a geodesic distance kernel on a reconstructed Gaussian sc…

View →

cs.CVcs.AIcs.GRRecentMay 28, 2026

City-Mesh3R: Simulation-Ready City-Scale 3D Mesh Reconstruction from Multi-View Images

Sayan Paul, Sourav Ghosh, Siddharth Katageri, Soumyadip Maity +2 more

City-Mesh3R is a scalable, end-to-end framework that reconstructs high-fidelity, watertight 3D surface meshes of entire city-scale environments directly from large collections of multi-view images.

View →

cs.CVRecentJun 3, 2026

Controllable Dynamic 3D Shape Generation via 3D Trajectories and Text

Jaeyeong Kim, Ines Kim, Jahyeok Koo, Seungryong Kim

T2Mo is a novel framework that generates controllable dynamic 3D object shapes by combining explicit 3D trajectories for spatial guidance with natural language text semantics.

View →

cs.CVcs.AIeess.IVRecentJun 1, 2026

Towards 3D-Aware Video Diffusion Models: Render-Free Human Motion Control with Mesh Tokenization

Jingyun Liang, Min Wei, Shikai Li, Yizeng Han +4 more

The paper proposes a novel render-free framework that conditions video diffusion models directly on compressed 3D human mesh tokens, enabling robust 3D-aware human motion control without relying on re…

View →

cs.CVcs.AIcs.GRRecentMay 31, 2026

3DCodeBench: Benchmarking Agentic Procedural 3D Modeling Via Code

Yipeng Gao, Lei Shu, Genzhi Ye, Xi Xiong +4 more

The paper introduces 3DCodeBench, a systematic benchmark and platform for evaluating Vision-Language Model (VLM) agents' ability to generate procedural 3D models from text and images using code.

View →

cs.CVcs.AIRecentJun 3, 2026

GeM-NR: Geometry-Aware Multi-View Editing for Nonrigid Scene Changes

Josef Bengtson, Yaroslava Lochman, Fredrik Kahl

GeM-NR proposes a novel, training-free framework to achieve general multi-view image editing, enabling consistent edits that drastically change both the geometry and appearance of a nonrigid scene.

View →

cs.CVcs.AIcs.RORecentMay 28, 2026

Prior Availability in Industrial Visual Sim-to-Real: A Review of CAD-Guided and CAD-Unavailable Regimes

Chenxi Tao, Seung-Kyum Choi

The paper reframes industrial visual sim-to-real transfer as a domain-gap problem categorized by the availability of explicit object geometry (CAD), arguing that the required prior evidence dictates t…

View →

cs.AIRecentJun 1, 2026

WorldCoder-Bench: Benchmarking Physically Grounded 3D World Synthesis

Shuo Lu, Yinuo Xu, Kecheng Yu, Siru Jiang +7 more

The paper introduces WorldCoder-Bench, a comprehensive benchmark and evaluation protocol for testing LLMs' ability to autonomously generate complex, physically grounded, and interactive 3D web worlds.

View →

cs.CRRecentApr 23, 2026

Cross-Modal Phantom: Coordinated Camera-LiDAR Spoofing Against Multi-Sensor Fusion in Autonomous Vehicles

Shahriar Rahman Khan, Raiful Hasan

The paper demonstrates a coordinated, cross-modal spoofing attack that successfully deceives state-of-the-art multi-sensor fusion systems in autonomous vehicles by making multiple sensors agree on a f…

View →

cs.CVcs.CRcs.LGRecentApr 14, 2026

PatchPoison: Poisoning Multi-View Datasets to Degrade 3D Reconstruction

Prajas Wadekar, Venkata Sai Pranav Bachina, Kunal Bhosikar, Ankit Gangwal +1 more

PatchPoison introduces a lightweight dataset-poisoning method that injects small, high-frequency adversarial patches into multi-view image datasets to systematically corrupt feature matching and degra…

View →

cs.CVcs.AIRecentJun 1, 2026

Modeling Depth Ambiguity: A Mixture-Density Representation for Flying-Point-Free Depth Estimation

Siyuan Bian, Congrong Xu, Jun Gao

The paper introduces a Mixture-Density Representation (MDA) to model depth ambiguity, effectively eliminating 'flying-point' artifacts at object boundaries by allowing pixels to predict multiple possi…

View →

cs.CVcs.CRcs.LGRecentMay 29, 2026

Latent Geometric Chords for Query-Efficient Decision-Based Adversarial Attacks

Ei Hmue Khine, Yao Li, Jiebao Sun, Shengzhu Shi +2 more

The paper proposes Latent Geometric Chords (LGC) and LGC-H, a novel method that navigates decision boundaries using curvature-aware geometric search within a semantic manifold to generate high-fidelit…

View →

cs.CVcs.AIcs.CRRecentMay 26, 2026

Rotation-Invariant Spherical Watermarking via Third-Order SO(3) Representation Coupling

Pengzhen Chen, Yanwei Liu, Xiaoyan Gu, Antonios Argyriou +2 more

The paper introduces a novel third-order, rotation-invariant spherical bispectrum for watermarking panoramic images, enabling reliable watermark embedding and extraction under arbitrary 3D rotations.

View →