~ similar to 2605.28983· 18 results
The paper introduces a Jacobian-based spectral audit to evaluate neural operators, demonstrating that standard prediction error metrics fail to capture crucial local dynamical structures and operator…
This book provides a compact, derivation-oriented mathematical primer that connects major families of generative AI models, showing their underlying structural relationships.
The paper introduces Singularity-aware Adam (S-Adam), a novel optimizer that stabilizes deep learning training in non-smooth loss landscapes by dynamically damping updates based on local geometric ins…
Arnaud Descours, Arnaud Guillin, Geoffrey Lacour, Manon Michel +2 more
This paper develops a novel, computationally efficient method to quantify the uncertainty in wide neural network predictions by characterizing the limiting random fluctuations using stochastic evoluti…
The paper analyzes congruence-based neural architectures for classifying positive-definite matrices, demonstrating that common semi-orthogonality constraints severely limit the model's expressivity.
The paper investigates applying Riemannian optimization techniques to low-rank matrix parameters for deep learning, but finds that the proposed methods do not conclusively outperform the AdamW baselin…
The paper establishes that the training process of fully connected deep neural networks (DNNs) on exponential family data is mathematically equivalent to performing a Renormalization Group (RG) calcul…
The paper introduces a unified Physics-Informed Deep Learning (PIDL) framework that simultaneously enforces physical laws and information-theoretic bounds, demonstrating robust, domain-agnostic entrop…
The paper introduces hybrid neural world models that provide fast, multi-horizon predictions for complex physical dynamics, implicitly handling sharp events like shocks and contacts without explicit t…
The paper introduces Cellular Sheaf Neural Operators, a discretization-aware framework that models constrained PDEs by representing physical states on oriented cell complexes to enforce structure-pres…
The paper reformulates nonreversible perturbations of Fokker--Planck dynamics as gauge fields, providing a unified operator viewpoint to analyze relaxation processes and develop methods for learning o…
Senmiao Wang, Tiantian Fang, Haoran Zhang, Yushun Zhang +3 more
This paper proposes a preconditioning layer for stable weight conditioning in LLM training.
Senmiao Wang, Tiantian Fang, Haoran Zhang, Yushun Zhang +3 more
This paper proposes a preconditioning layer for stable weight conditioning in LLM training.
The paper proposes PG-RSSNN, a physics-guided recurrent state-space neural network that improves multi-step prediction stability and accuracy compared to both pure black-box and pure physics models, e…
The paper introduces a non-intrusive variant of index-aware learning for solving differential-algebraic equations (DAEs), ensuring that learned solutions maintain physical consistency like Kirchhoff's…
The paper introduces an adaptive reservoir computing framework that tailors Echo State Networks (ESNs) to specific evaluation scenarios, achieving a high score on the CTF-4-Science Lorenz benchmark fo…
This paper provides the first non-vacuous generalization analysis for the Stochastic Variance Reduced Gradient (SVRG) method by establishing sharp, data-dependent algorithmic stability bounds, thereby…
The paper introduces and analyzes several novel data appraisal metrics, including the Vendi Score and matrix spectral functions, demonstrating that efficient optimization techniques make these metrics…