~ similar to 2606.05982 · 17 results
This paper presents a unified framework for end-to-end co-design of neural network processors.
This paper establishes an exact mathematical correspondence between training and inference in deep learning and the solution of Hamilton-Jacobi partial differential equations, unifying multiple theore…
The paper introduces the Computation-Aware State-Space Model (CASSM), a novel framework that extends Bayesian methods to handle model selection and large state-spaces, achieving competitive performanc…
The paper reformulates nonreversible perturbations of Fokker–Planck dynamics as gauge fields, providing a unified operator viewpoint to analyze relaxation processes and develop methods for learning o…
The paper introduces an adaptive reservoir computing framework that tailors Echo State Networks (ESNs) to specific evaluation scenarios, achieving a high score on the CTF-4-Science Lorenz benchmark fo…
The scaling exponent in neural scaling laws is not fixed but systematically depends on the optimizer used, with preconditioned optimizers generally yielding steeper scaling.
The paper establishes that the training process of fully connected deep neural networks (DNNs) on exponential family data is mathematically equivalent to performing a Renormalization Group (RG) calcul…
This paper investigates limitations of learning tanh neural networks under finite-precision computations and Lp accuracy guarantees.
This paper investigates limitations of learning tanh neural networks under finite-precision computations and Lp accuracy guarantees.
The paper introduces a unified Physics-Informed Deep Learning (PIDL) framework that simultaneously enforces physical laws and information-theoretic bounds, demonstrating robust, domain-agnostic entrop…
The paper develops a quantitative framework to analyze and improve flow distillation in diffusion models, providing stability guarantees and suggesting non-uniform time scheduling to reduce approximat…
This paper provides the first non-vacuous generalization analysis for the Stochastic Variance Reduced Gradient (SVRG) method by establishing sharp, data-dependent algorithmic stability bounds, thereby…
The paper introduces a Jacobian-based spectral audit to evaluate neural operators, demonstrating that standard prediction error metrics fail to capture crucial local dynamical structures and operator…
Yuxin Wang, Yuanzhe Hu, Xiaokun Zhong, Xiaopeng Wang +6 more
This paper analyzes the multi-regime behavior of Scientific Machine Learning (SciML) models, finding that optimization effectiveness is regime-specific and that failure modes require a unified, regime…
The paper introduces partial multi-neuron relaxation, a novel verification technique that selectively computes tight linear bounds for a small subset of neurons to improve the efficiency and tightness…
The paper proposes a stochastic risk-aware optimization framework for covert quantum communication, significantly improving throughput and expanding feasible operating regions under realistic channel…
The paper analyzes the phase transitions of the noisy transformer model on the unit sphere, proving a sharp global-minimizer dichotomy that depends on the dimension and coupling strength.