Papers similar to 2605.29153

19 results

cs.LG · cs.AI · Recent · May 28, 2026

Do Physics Foundation Models Learn Generalizable Physics? A Bias-Aware Benchmark Across Physical Regimes and Distribution Shifts

Mengdi Chu, Yang Liu, Ayan Biswas, Han-Wei Shen

The paper introduces a comprehensive benchmark to test if physics foundation models learn generalizable dynamics, finding that their performance is highly conditional and not universally general.

cs.LG · cs.AI · math.OC · Recent · May 28, 2026

Singularity-aware Optimization via Randomized Geometric Probing: Towards Stable Non-smooth Optimization

Ruoran Xu, Borong She, Xiaobo Jin, Qiufeng Wang

The paper introduces Singularity-aware Adam (S-Adam), a novel optimizer that stabilizes deep learning training in non-smooth loss landscapes by dynamically damping updates based on local geometric ins…

cs.CR · Recent · May 14, 2026

Defenses at Odds: Measuring and Explaining Defense Conflicts in Large Language Models

Xiangtao Meng, Wenyu Chen, Chuanchao Zang, Xinyu Gao +4 more

This paper systematically measures and explains how sequential model defenses can conflict, finding that 38.9% of ordered defense sequences cause measurable risk exacerbation due to anti-aligned param…

cs.AI · cs.CL · cs.CR · Recent · Apr 27, 2026

An Information-Geometric Framework for Stability Analysis of Large Language Models under Entropic Stress

Hikmat Karimov, Rahid Zahid Alekberli

The paper proposes a novel information-geometric framework to analyze LLM stability by integrating task utility, external entropy, and internal structural proxies, showing this composite score improve…

cs.LG · cs.AI · stat.ML · Recent · May 28, 2026

On the Optimizer Dependence of Neural Scaling Laws

Vansh Ramani, Shourya Vir Jain

The scaling exponent in neural scaling laws is not fixed but systematically depends on the optimizer used, with preconditioned optimizers generally yielding steeper scaling.

cs.AI · cs.LG · Recent · May 29, 2026

Diagnosing Failure Modes of Shared-State Collaboration in Resource-Constrained Visual Agents

Yunpeng Zhou

This paper analyzes failure modes in collaborative visual reasoning systems, demonstrating that naive shared workspaces can amplify hallucinations and proposing diagnostics for improving communication…

cs.AI · Recent · May 31, 2026

The Case for Model Science: Verify, Explore, Steer, Refine

Przemyslaw Biecek, Luca Longo, Jianlong Zhou, Thomas Fel +2 more

The paper advocates for the establishment of Model Science, a systematic discipline that moves beyond simple benchmarking to deeply analyze AI models' internal workings and failure modes.

cs.LG · cs.AI · cs.CV · Recent · Jun 1, 2026

Rethinking Evaluation Paradigms in IBP-based Certified Training

Konstantin Kaulen, Hadar Shavit, Holger H. Hoos

The paper proposes evaluating certified training methods by comparing their Pareto fronts across the natural-certified accuracy trade-off, revealing superior performance and previously unappreciated c…

cs.LG · cs.AI · Recent · May 31, 2026

Physics-Informed Deep Learning for Entropy Prediction in Heterogeneous Systems: Thermodynamic and Information-Theoretic Case Studies

Biswajeet Sahoo, Debadutta Patra

The paper introduces a unified Physics-Informed Deep Learning (PIDL) framework that simultaneously enforces physical laws and information-theoretic bounds, demonstrating robust, domain-agnostic entrop…

cs.NE · math.AP · math.PR · Recent · Jun 4, 2026

Quantifying Uncertainty In Wide Two-Layer Neural Networks: On The Law Of The Limiting Fluctuation Process

Arnaud Descours, Arnaud Guillin, Geoffrey Lacour, Manon Michel +2 more

This paper develops a novel, computationally efficient method to quantify the uncertainty in wide neural network predictions by characterizing the limiting random fluctuations using stochastic evoluti…

cs.CR · cs.AI · cs.LG · Recent · May 24, 2026

Furina: Fragmented Uncertainty-Driven Refusal Instability Attack

Tongxi Wu, Jian Zhang, Yang Gao

The paper challenges the assumption that LLM safety is a binary threshold, proposing that safety failures occur in an 'instability region' and introducing Furina, a transferable attack that exploits t…

cs.LG · cs.CL · Recent · May 30, 2026

Escaping the Mode Lottery: Multi-Response Training Improves Language Model Generalization

Hasan Amin, Kian Ahrabian, Ming Yin, Rajiv Khanna

The paper introduces Multi-Response Training (MRT) to combat the 'mode lottery' problem in language model fine-tuning, showing that retaining multiple valid responses significantly improves distributi…

cs.AI · cs.LG · Recent · May 27, 2026

Adaptive Reservoir Computing for Multi-Scenario Chaotic System Forecasting

Shadmehr Zaregarizi, Khashayar Yavari

The paper introduces an adaptive reservoir computing framework that tailors Echo State Networks (ESNs) to specific evaluation scenarios, achieving a high score on the CTF-4-Science Lorenz benchmark fo…

cs.LG · cs.CL · Recent · Jun 1, 2026

A Local Perturbation Theory for Cross-Domain Interference and Recovery in Multi-Domain RL

Lei Yang, Siyu Ding, Deyi Xiong

The paper proposes a local perturbation theory showing that cross-domain interference in multi-domain RL occurs via a low-dimensional shared conflict subspace, which can be selectively mitigated by sh…

cs.CR · Recent · Apr 2, 2026

PARD-SSM: Probabilistic Cyber-Attack Regime Detection via Variational Switching State-Space Models

Prakul Sunil Hiremath, PeerAhammad M Bagawan, Sahil Bhekane

PARD-SSM is a probabilistic framework that models network traffic as a switching state-space system to detect multi-stage cyber-attacks in real time with high accuracy and predictive capability.

cs.AI · cs.LG · Recent · May 28, 2026

When and How Human Curation Backfires: Preference Alignment under Multi-Model Self-Consuming Loop

Yang Zhang, Xiukun Wei, Xueru Zhang

This paper analyzes multi-model self-consuming training, showing that while human curation helps individual models, cross-model interactions can degrade long-term alignment by dampening or inverting t…

cs.LG · cs.AI · math.NA · Recent · May 27, 2026

Hybrid Neural World Models

Pranav Lakshmanan, Paras Chopra

The paper introduces hybrid neural world models that provide fast, multi-horizon predictions for complex physical dynamics, implicitly handling sharp events like shocks and contacts without explicit t…

math.NA · cs.LG · Recent · Jun 1, 2026

Spectral Audit of In-Context Operator Networks

Zhiwei Gao, Liu Yang, George Em Karniadakis

The paper introduces a Jacobian-based spectral audit to evaluate neural operators, demonstrating that standard prediction error metrics fail to capture crucial local dynamical structures and operator…

cs.CL · cs.LG · Recent · May 29, 2026

Cognitive Fatigue in Autoregressive Transformers: Formalization and Measurement

Riju Marwah, Ritvik Garimella, Vishal Pallagani, Atishay Jain +2 more

The paper formalizes LLM degradation during long generation as 'cognitive fatigue' and introduces the Fatigue Index (FI), a measurable, model-agnostic diagnostic tool for real-time monitoring.
