"Generalization error" | ArxivCSExplorer

20 results for “Generalization error”

Hybrid search: keyword + semantic, ranked by combined score.

math.ST | cs.LG | math.PR | Empirical | Recent | Jun 4, 2026

How abundant are good interpolators?

This paper establishes a large deviation principle for the generalization error of interpolating classifiers in the overparametrized regime.

cs.LG | stat.ML | Theoretical | Recent | Jun 9, 2026

Limitations of Learning Tanh Neural Networks with Finite Precision

Philipp Grohs, Matěj Trödler

This paper investigates limitations of learning tanh neural networks under finite-precision computations and Lp accuracy guarantees.

cs.LG | cs.AI | Recent | May 29, 2026

Inconsistency-Aware Minimization: Improving Generalization with Unlabeled Data

Hee-Sung Kim, Hyeonseong Kim, Sungyoon Lee

The paper introduces Inconsistency-Aware Minimization (IAM), a novel training objective that uses a label-free measure called local inconsistency to improve model generalization, particularly in semi-…

cs.LG | cs.AI | Recent | May 27, 2026

Learning Theory of the SVRG: Generalization and Convergence Analysis

Yunwen Lei, Zimeng Wang, Xiaoming Yuan

This paper provides the first non-vacuous generalization analysis for the Stochastic Variance Reduced Gradient (SVRG) method by establishing sharp, data-dependent algorithmic stability bounds, thereby…

cs.LG | cs.AI | stat.ML | Recent | May 30, 2026

A Practical Upper Bound on Selection Bias Effects in Medical Prediction Models

Kara Liu, Maggie Wang, Russ B. Altman

The paper proposes a novel, practical upper bound to estimate the worst-case performance of medical prediction models on the target population, even when the selection bias mechanism and target data a…

cs.LG | cs.AI | cs.CV | Recent | May 30, 2026

On the Difficulty of Learning a Meta-network for Training Data Selection

Zilin Du, Junqi Zhao, Boyang Albert Li

This paper analyzes the poor performance of Meta-learning for Training-data Selection (MTS) and proposes that increasing the batch size and incorporating informative features can significantly improve…

cs.LG | cs.IR | Empirical | Recent | Jun 10, 2026

DeMix: Debugging Training Data with Mixed Data Error Types by Investigating Influence Vectors

Jiale Deng, Yanyan Shen, Xiaogang Shi, Chai Junjun

This paper proposes DeMix, a novel framework for simultaneously diagnosing erroneous samples and their error types in machine learning models.

cs.CR | Recent | May 6, 2026

Assessing Generalisation Capability of Machine Learning Models for Intrusion Detection

Md Zakir Hossain, Md Ayshik Rahman Khan, Md Rafiqul Islam, Syed Mohammed Shamsul Islam +1 more

The study assesses the generalization capability of supervised machine learning models for intrusion detection using UNSW-NB15 and TON_IoT, finding a significant performance drop when models are teste…

cs.LG | cs.AI | Recent | May 28, 2026

Score Broadcast and Decorrelation: A General Framework for Broadcast-Based Credit Assignment

Mustafa Uzun, Mete Erdogan, Cengiz Pehlevan, Alper T. Erdogan

The paper introduces Score Broadcast and Decorrelation (SBD), a general theoretical framework that unifies broadcast-based credit assignment across various differentiable loss functions by leveraging…

cs.AI | Recent | May 28, 2026

Quantifying and Optimizing Simplicity via Polynomial Representations

Tianren Zhang, Xiangxin Li, Minghao Xiao, Guanyu Chen +1 more

The paper introduces polynomial representations as a quantitative, distribution-aware metric for measuring model simplicity, demonstrating that the effective degree of this representation is a superio…

cs.AI | cs.CL | Recent | May 27, 2026

The Importance of Being Statistically Earnest: A Critical Re-evaluation of GSM-Symbolic

Dominika Agnieszka Długosz, Arlindo Oliveira, Natalia Díaz-Rodríguez

The paper challenges the conclusion that LLMs lack reasoning by demonstrating that reported performance drops on GSM-Symbolic are often statistically weak and partially attributable to dataset biases,…

cs.LG | cs.AI | cs.CL | Recent | May 27, 2026

Learn from Weaknesses: Automated Domain Specialization for Small Computer-Use Agents

Suji Kim, Kangsan Kim, Sung Ju Hwang

The paper introduces LearnWeak, an annotation-free framework that automatically specializes small computer-use agents by identifying and targeting their specific weaknesses using a stronger reference…

math.ST | stat.ME | stat.ML | Recent | Jun 4, 2026

Optimally taming biases in black-box models for efficient semiparametric estimation

Yihong Gu, Qishuo Yin, Tianxi Cai, Jianqing Fan

The paper proposes a new, optimal estimator for semiparametric inference that improves upon standard double machine learning (DML) rates by eliminating the first-order stochastic error of nuisance fun…

cs.LG | cs.CL | Recent | May 31, 2026

Trust Functions: Near-Lossless Weak-to-Strong Generalization by Learning When to Trust the Weak Teacher

Arda Uzunoglu, Alvin Zhang, Daniel Khashabi

The paper introduces trust functions to filter weak supervision labels, enabling near-lossless weak-to-strong generalization by selectively training a strong student using only the most reliable weak…

cs.CL | cs.AI | Recent | Jun 1, 2026

From Layers to Submodules: Rethinking Granularity in Replacement-Based LLM Compression

Elia Cunegatti, Marcus Vukojevic, Erik Nielsen, Giovanni Iacca

The paper proposes SubFit, a novel compression technique that achieves superior LLM compression by replacing non-contiguous, submodule-level components (Attention and FeedForward) with lightweight res…

cs.LG | cs.AI | cs.SD | Recent | May 30, 2026

Logit Distillation on Manifolds: Mapping by Learning

Yiru Yang, Junling Wang, Nishant Kumar Singh, Luohong Wu +1 more

The paper proposes a novel layer and point-wise projection mapping combined with LoRA injection to efficiently distill knowledge from a large teacher model to a small student model, significantly impr…

stat.ML | cs.LG | Recent | Jun 1, 2026

Doing well with less! On Sampling Techniques for Empirical Pairwise Loss Estimation/Minimization

Louise Davy, Stephan Clémençon, Charlotte Laclau

This paper introduces survey sampling techniques to estimate or minimize empirical pairwise loss functions, showing that targeting informative pairs significantly reduces computational cost while main…

cs.DC | cs.AI | Recent | Jun 1, 2026

Not All Errors Are Equal: A Systematic Study of Error Propagation in Large Language Model Inference

Yafan Huang, Sheng Di, Guanpeng Li

This paper systematically studies how soft errors propagate during Large Language Model (LLM) inference using a novel fault-injection framework, providing critical insights and mitigation strategies f…

cs.LG | cs.AI | cs.CV | Recent | May 30, 2026

SORA: Free Second-Order Attacks in Fast Adversarial Training

Mazdak Teymourian, Ramtin Moslemi, Farzan Rahmani, Mohammad Hossein Rohban

The paper introduces SORA, an adaptive adversarial training method that dynamically adjusts perturbation sizes to prevent Catastrophic Overfitting, achieving state-of-the-art robustness and clean accu…

cs.CR | Recent | Mar 26, 2026

LiteGuard: Efficient Task-Agnostic Model Fingerprinting with Enhanced Generalization

Guang Yang, Ziye Geng, Yihang Chen, Changqing Luo

LiteGuard proposes an efficient task-agnostic model fingerprinting framework that achieves enhanced generalization and significantly reduces computational overhead compared to existing methods like Me…
