Similar to 2606.02515 · 18 results
Udbhav Bamba, Arnav Chavan, Aryamaan Thakur, Steve Teig +1 more
DOT-MoE introduces a novel framework that treats the decomposition of dense layers into Mixture of Experts (MoE) as a Differentiable Optimal Transport problem, achieving superior efficiency while pres…
The paper develops a quantitative framework to analyze and improve flow distillation in diffusion models, providing stability guarantees and suggesting non-uniform time scheduling to reduce approximat…
This paper develops a perturbation theory for spherical Hellinger-Kantorovich (SHK) gradient flows, providing explicit, time-dependent bounds on divergence metrics to guarantee differential privacy fo…
The paper proposes a semi-relaxed Gromov-Wasserstein objective to estimate the latent connectivity structure of large-scale networks, achieving statistically consistent and efficient recovery of the u…
The paper proposes a novel active learning framework using Linearized Optimal Transport to strategically select measurement timepoints, thereby minimizing uncertainty when inferring continuous probabi…
The paper introduces a distributional framework using Wasserstein distance to unify the semantic comparison of sparse autoencoder features across different layers and to automatically compress large f…
The paper introduces a framework for composing deep probabilistic models using five specific factor-graph primitives that guarantee closed-form variational inference, thereby preserving tractability i…
Salim I. Amoukou, Emanuele Albini, Tom Bewley, Saumitra Mishra +1 more
The paper introduces Entropic Projection Alignment (EPA), a unified framework that estimates, explains, and improves model performance under distribution shift by aligning source and target distributi…
The paper introduces a computational framework using Hodge zero-modes to track the geometry of topological features in parameter-dependent data, providing metrics like curvature and holonomy to quanti…
The paper proposes a unified, constrained optimization framework using KL divergence and likelihood constraints to achieve effective and principled unlearning in diffusion models.
The paper uses majorization theory to analyze lattice reduction, showing that local swaps smooth the Gram-Schmidt profile and deriving variational and telescoping identities for the worst-case profile…
Yuanjian Xu, Jianing Hao, Wanbo Zhang, Zhong Li +1 more
The paper proposes DiReCT, a novel framework that treats data selection during LLM annealing as a constrained optimization problem based on the spectral geometry of the loss landscape, achieving state…
The paper introduces a diffusion-based uncertainty model for robust optimization on graphs, showing that the resulting computational complexity depends critically on the interaction between the uncert…
The paper proposes using pseudo-sensitivities, derived from adjoint sensitivity fields, as an optimal conditioning signal in a Bernoulli flow-matching framework to significantly improve the out-of-dis…
The paper introduces Complexity-Balanced Splitting (CBS), a framework that efficiently allocates model capacity across the diffusion timeline by focusing computational resources on the most complex ge…
The paper introduces ProbMoE, a probabilistic routing framework that tackles the non-differentiability of top-$k$ routing in Mixture-of-Experts (MoE) models, achieving strong performance with improved…
The paper demonstrates that enforcing a local conservative finite volume structure is crucial for achieving stable, accurate long-term autoregressive rollouts of plasma transport simulations, outperfo…
Zihan Li, Jialan Zheng, Ziyu Li, Xun Yuan +17 more
The paper introduces PIGMENT, a physics-informed foundation model that enables reliable quantitative mapping of brain microstructure from extremely sparse or challenging diffusion MRI scans.