Similar to 2606.03831 · 18 results
This paper introduces Repeated Policy Regret (RP-Regret), a novel game-theoretic metric for analyzing regret in repeated games with adaptive opponents, and proposes algorithms to minimize it.
Liad Erez, Fan Chen, Alon Cohen, Tomer Koren +3 more
The paper analyzes the sample complexity of contextual bandits in the $s$-sparse setting, achieving optimal sample bounds for identifying an $\epsilon$-optimal policy.
The paper introduces MINTS, a minimalist Bayesian framework that simplifies sequential decision-making by placing priors only on the optimum location, allowing for the incorporation of structural cons…
The paper analyzes a new class of asynchronous adaptive first-order optimization methods and proves their stochastic convergence rate is $O(1/\sqrt{t})$ for non-convex functions.
The paper establishes tight upper and lower bounds on the statistical cost of approximate machine unlearning for smooth strongly convex losses, showing that the optimal unlearning rate depends critica…
This paper provides the first non-vacuous generalization analysis for the Stochastic Variance Reduced Gradient (SVRG) method by establishing sharp, data-dependent algorithmic stability bounds, thereby…
The paper introduces the Markov decision contest, a new framework for reinforcement learning using pairwise preferences, and proves that stationary Markov policies are optimal and solvable efficiently…
The paper introduces a new anytime-valid inference method to correct split selection in online decision trees, providing robust statistical guarantees for streaming data that existing methods lack.
Johanna Menn, Miriam Kober, Paul Brunzema, David Stenger +1 more
The paper introduces local Preferential Bayesian Optimization (PBO) methods that adapt high-dimensional Bayesian Optimization techniques, such as trust-region and derivative-informed local search, to…
The paper introduces Singularity-aware Adam (S-Adam), a novel optimizer that stabilizes deep learning training in non-smooth loss landscapes by dynamically damping updates based on local geometric ins…
Dongjun Kim, Adrian de Wynter, Huancheng Chen, Heasung Kim +1 more
The paper introduces FoLoRA, a novel optimization framework that uses a generalized Rayleigh quotient to achieve a superior balance between adapting foundation models to specific tasks and preserving…
This paper introduces survey sampling techniques to estimate or minimize empirical pairwise loss functions, showing that targeting informative pairs significantly reduces computational cost while main…
Zhi Zhou, Ming Yang, Shi-Yu Tian, Kun-Yang Yu +2 more
The paper establishes the first theoretical framework for analyzing the learnability of Test-Time Adaptation (TTA) under non-stationary data streams by introducing Recovery Complexity, which quantifie…
The paper develops an optimistic maximum-likelihood algorithm that achieves $\tilde{O}(\sqrt{T})$ policy regret for sequential decision-making in partially observable Markov games against adaptive oppo…
Yuanjian Xu, Jianing Hao, Wanbo Zhang, Zhong Li +1 more
The paper proposes DiReCT, a novel framework that treats data selection during LLM annealing as a constrained optimization problem based on the spectral geometry of the loss landscape, achieving state…
The paper introduces Posterior Hybrid Bayesian Belief (PhyB), a novel framework that reformulates policy optimization in Bayesian Offline RL by approximating expectations as a convex combination over…
Zakk Heile, Hayden McTavish, Varun Babbar, Margo Seltzer +1 more
The paper introduces PRAXIS, a novel algorithm that efficiently approximates the computation of 'Rashomon sets' for decision trees, significantly reducing memory and runtime complexity.
This paper analyzes the poor performance of Meta-learning for Training-data Selection (MTS) and proposes that increasing the batch size and incorporating informative features can significantly improve…