~ similar to 2606.01023· 20 results
The paper introduces a structured benchmark (TGAD) showing that current text-guided anomaly detection models often overstate their language conditioning, as performance significantly degrades when the…
The paper proposes a graph attention-based virtual metrology framework that accurately predicts film thickness in semiconductor deposition by modeling structured, directional dependencies among hetero…
Wanhao Liu, Jiaqing Xie, Qian Tan, Weida Wang +9 more
The paper introduces OmniMatBench, a comprehensive, human-calibrated multimodal reasoning benchmark covering 19 materials science subfields, revealing that current multimodal language models (MLLMs) h…
The paper reframes industrial visual sim-to-real transfer as a domain-gap problem categorized by the availability of explicit object geometry (CAD), arguing that the required prior evidence dictates t…
RefDiffNet is a lightweight, plug-and-play module that enhances PCB defect detection by comparing the defective image to a defect-free reference image, significantly improving detection accuracy with…
Marisa Ferrara Boston, Glen Hanson, Effi Georgala, JD Hudgens +1 more
The paper proposes a comprehensive monitoring and triage methodology for agentic systems, demonstrating that structural defects mask task-level errors and require specialized monitoring scopes for det…
Xinyu Yan, Boyang Chen, Jiaming Zhang, Tiantong Wu +11 more
The paper introduces FraudBench, a multimodal benchmark designed to detect AI-generated fraudulent refund evidence, finding that current AI models struggle significantly with claim-conditioned fake-da…
Qian Kou, Xiaofeng Shi, Yulin Li, Xiaosong Qiu +3 more
The paper introduces MechVQA, a comprehensive dataset and benchmark for mechanical drawing understanding, and proposes the MechVL model, which significantly improves Multimodal LLMs' performance on th…
Yule Liu, Yilong Yang, Jiale Teng, Hanze Jia +10 more
The paper systematically measures the risk of current image-to-3D models generating harmful geometries, finding that these models are effective at reconstruction and existing safeguards are insufficie…
The paper introduces MUSE, a comprehensive benchmark that evaluates Text-to-CAD generation by assessing complex assemblies based on functionality, manufacturability, and assemblability, moving beyond…
Adrián Cánovas-Rodriguez, Miguel A. González-Illán, Maria Fernanda García-Cruz, Pedro Nortes Tortosa +4 more
The paper proposes an attention-enhanced deep learning framework using EfficientNet and CBAM to achieve high accuracy (93.3%) in classifying peach leaf damage, demonstrating improved robustness under…
Christian Gehrmann, Jonas Ricker, Simon Damm, Deruo Cheng +4 more
The paper introduces SAMSEM, a generalized and scalable model based on SAM2, which significantly improves metal line segmentation across diverse and unseen integrated circuit (IC) samples.
This paper proposes using color statistics, specifically through novel color transformations, to detect AI-generated synthetic images by exploiting the color-imitation weaknesses of current generative…
The paper demonstrates that using synthetic hand images containing accessories, generated via inpainting, significantly improves the robustness of hand detectors for safety-critical applications by cl…
The paper introduces CalArena, a large-scale, standardized benchmark covering nearly 2000 experiments to comprehensively evaluate post-hoc calibration methods, finding that smooth calibration function…
The paper introduces an agentic, framework-based system to transform under-specified academic papers into standardized, comparable, and executable benchmarks for industrial Prognostics and Health Mana…
The paper introduces RADAR, a risk-aware automated code review system, demonstrating that it can significantly reduce review bottlenecks and improve efficiency for AI-generated code without compromisi…
Yusuke Ohtsubo, Kota Dohi, Koichiro Yawata, Koki Takeshita +1 more
The paper proposes a visual program synthesis framework using a VLM to generate accurate training data for semiconductor inspection, mitigating the sim-to-real gap by applying input binarization to st…
This study systematically evaluates Vision Mamba models for detecting AI-generated images, finding that while they show promise, their current strengths and limitations must be understood relative to…
Jiachen Zhang, Junyi Lao, Chenghao Liu, Siyuan Liu +4 more
VFEAgent is a novel multi-agent framework that automates the entire Finite Element Analysis (FEA) workflow, achieving high success rates in generating complete and physically valid simulations directl…