~ similar to 2606.02292· 17 results
The paper proposes a fast and lightweight novel view synthesis method using a differentiable Multiplane Image (MPI) representation, achieving significant speed and size improvements over state-of-the-…
The paper introduces Staged Executable Inverse Graphics (SEIG), an agentic framework that uses general-purpose Vision-Language Models (VLMs) to reconstruct editable 3D scenes directly into executable…
The paper proposes a novel method to improve the simultaneous representation of appearance and geometry in 3D Gaussian Splatting by introducing an additional geometry opacity parameter.
Zihan Li, Jialan Zheng, Ziyu Li, Xun Yuan +17 more
The paper introduces PIGMENT, a physics-informed foundation model that enables reliable quantitative mapping of brain microstructure from extremely sparse or challenging diffusion MRI scans.
The paper introduces an adaptive feature-optimized vision front end that intelligently selects and budgets visual features for 3D reconstruction, significantly improving reconstruction quality and com…
Yusuke Ohtsubo, Kota Dohi, Koichiro Yawata, Koki Takeshita +1 more
The paper proposes a visual program synthesis framework using a VLM to generate accurate training data for semiconductor inspection, mitigating the sim-to-real gap by applying input binarization to st…
Lu Liu, Huiyu Duan, Chenxin Zhu, Jintong Lu +5 more
The paper introduces LL-Bench, a comprehensive benchmark for evaluating large-scale generative models on low-level vision tasks, and proposes LL-Score, an MLLM-based evaluator that better aligns quali…
Pengfei Jin, Yiqi Tian, Kailong Fan, Bingjie Qi +1 more
The paper introduces Robust Prior Update (RPU), a module that improves the faithfulness of diffusion-based inverse solvers by stabilizing the prior update step, thereby reducing measurement-conditione…
The paper introduces a subgrid marching tetrahedra scheme that accurately recovers complex, intersection-free manifold meshes from tetrahedral grids, overcoming limitations of classic marching methods…
The paper proposes a real-time, predictive, and task-aware foveated imaging system that dynamically allocates limited sensor bandwidth to task-relevant regions of interest, significantly improving per…
Reasmory introduces a structured programming framework that uses explicit 3D memory and a Domain-Specific Language (DSL) to reliably enhance Vision-Language Models' spatial reasoning capabilities, ach…
Zhiwei Chen, Yijie Li, Yimo Zhang, Shiyun Shao +8 more
GaMi is a multimodal material identification system that uses mmWave and acoustic sensing with a cross-modal subtractive disentanglement framework to achieve high accuracy (95.2%) for material identif…
Aoduo Li, Jiancheng Li, Huan Ye, Hongjian Xu +4 more
VEDAL introduces a variational, error-driven asynchronous learning framework to efficiently prune 3D Gaussian Splatting, achieving high compression ratios with minimal loss in novel view synthesis qua…
RefDiffNet is a lightweight, plug-and-play module that enhances PCB defect detection by comparing the defective image to a defect-free reference image, significantly improving detection accuracy with…
This paper proposes using color statistics, specifically through novel color transformations, to detect AI-generated synthetic images by exploiting the color-imitation weaknesses of current generative…
This review surveys advanced techniques—including generative models, multimodal learning, and closed-loop workflows—for automated inverse materials design, enabling the targeted discovery of novel cry…
The paper introduces a Mixture-Density Representation (MDA) to model depth ambiguity, effectively eliminating 'flying-point' artifacts at object boundaries by allowing pixels to predict multiple possi…