Similar to 2605.30818 · 16 results
Wanhao Liu, Jiaqing Xie, Qian Tan, Weida Wang +9 more
The paper introduces OmniMatBench, a comprehensive, human-calibrated multimodal reasoning benchmark covering 19 materials science subfields, revealing that current multimodal language models (MLLMs) h…
This review surveys advanced techniques—including generative models, multimodal learning, and closed-loop workflows—for automated inverse materials design, enabling the targeted discovery of novel cry…
This paper investigates a novel vulnerability in tactile sensing by demonstrating that targeted Electromagnetic Interference (EMI) can induce strong, misleading 'phantom forces' in Hall-effect fingert…
EigeNet introduces a geometry-informed multi-modal Transformer framework to achieve state-of-the-art few-shot novel view Room Impulse Response (RIR) prediction by effectively integrating spatial geome…
The paper proposes two novel CAPTCHA types—ASCII art and overlapping audio—and demonstrates that current frontier LLMs struggle significantly to solve them, suggesting they are highly effective anti-b…
Shih-Yu Lai, Hirozumi Yamaguchi, Shang-Tse Chen, Yu-Lun Liu +1 more
UMEDA introduces a novel graph federated learning framework that uses spectral signal processing and diffusion models to enable privacy-preserving, robust localization across clients with highly heter…
The paper proposes a novel multimodal learning approach to predict the properties of new bilayer 2D materials formed by stacking dissimilar functional layers.
The paper introduces Partial Information Decomposition (PID) to quantitatively separate unique, redundant, and synergistic contributions of different modalities (e.g., vision, language) in multimodal…
Ben Wang, Xiaogang Li, Ruochen Gao, Peiyao Xiao +5 more
The paper introduces BilliardPhys-Bench, a new benchmark that demonstrates that current multimodal LLMs struggle with complex physical reasoning and predicting object dynamics in simulated environment…
Zixin Zhang, Fan Qi, Shuai Li, Xiaoshan Yang +1 more
The paper proposes FedMChain, a novel federated learning framework that structures multimodal training into sequential phases to mitigate modality competition and improve model performance while reduc…
The paper introduces a novel padding method that leverages crystal symmetry to enhance the encoding of complex inorganic structures, significantly improving the generation of stable, novel materials.
The paper demonstrates a coordinated, cross-modal spoofing attack that successfully deceives state-of-the-art multi-sensor fusion systems in autonomous vehicles by making multiple sensors agree on a f…
MASER is a lightweight framework that dynamically routes a shared Vision-Language Model (VLM) to the most appropriate modality-specific adapter (e.g., point cloud, RGB) based on the input question, si…
Yuming Zhao, Junhui Hou, Qijian Zhang, Jia Qin +1 more
The paper introduces PRISM, a novel representation learning framework that learns isometric embeddings by explicitly modeling the intrinsic geodesic metric of 3D surfaces, achieving superior performan…
Yule Liu, Yilong Yang, Jiale Teng, Hanze Jia +10 more
The paper systematically measures the risk of current image-to-3D models generating harmful geometries, finding that these models are effective at reconstruction and existing safeguards are insufficie…
Ioannis Prokopiou, Pantelis Vikatos, Maximos Kaliakatsos-Papakostas, Theodoros Giannakopoulos +1 more
The paper proposes an inference-time activation steering framework that uses orthogonalization to achieve fine-grained, deterministic control over discrete musical attributes such as pitch and duration…