20 results for “Overparametrization”
CS papers only. Hybrid search: keyword + semantic, ranked by combined score.
This paper establishes a large deviation principle for the generalization error of interpolating classifiers in the overparametrized regime.
Senmiao Wang, Tiantian Fang, Haoran Zhang, Yushun Zhang +3 more
This paper proposes a preconditioning layer for stable weight conditioning in LLM training.
This paper investigates limitations of learning tanh neural networks under finite-precision computations and Lp accuracy guarantees.
The paper introduces SORA, an adaptive adversarial training method that dynamically adjusts perturbation sizes to prevent Catastrophic Overfitting, achieving state-of-the-art robustness and clean accu…
The paper demonstrates that Low-Rank Adaptation (LoRA) is an effective and superior method for adapting large, pretrained Transformer surrogates for automotive aerodynamics to new vehicle families usi…
This paper investigates the application of Parameter-Efficient Fine-Tuning (PEFT) methods, specifically adapters and LoRA, to large pretrained models for instance segmentation, demonstrating that thes…
The paper proposes using pseudo-sensitivities, derived from adjoint sensitivity fields, as an optimal conditioning signal in a Bernoulli flow-matching framework to significantly improve the out-of-dis…
The paper introduces and explores Truly Linear FPT (TLFPT), a complexity class defined by $O(n) + f(k)$, demonstrating that it is a strict subset of standard Linear FPT and providing new algorithms fo…
Mengnan Zhao, Lihe Zhang, Bo Wang, Tianhang Zheng +2 more
The paper proposes a Distribution-aware Dynamic Guidance (DDG) strategy to mitigate catastrophic overfitting and the robustness-accuracy trade-off inherent in Fast Adversarial Training (FAT) by dynami…
This paper develops a supervised machine learning surrogate model, using a neural network, to predict the effective Lamé parameters of hyperelastic composites based on low-dimensional microstructural…
This paper introduces and analyzes a consistent estimator for the sub-Gaussian parameter ($\xi_*^2$), providing convergence rates and demonstrating its applicability in large-scale biological enrichme…
Mengnan Zhao, Lihe Zhang, Tianhang Zheng, Bo Wang +1 more
This paper reinterprets catastrophic overfitting (CO) in Fast Adversarial Training (FAT) as a weak backdoor mechanism, proposing backdoor-inspired strategies to mitigate this generalization failure.
This paper introduces BBOmix, an open-source benchmark for unsupervised representation learning on real-world biological data.
Shuqiang Wang, Wei Cao, Jiaqi Weng, Jialing Tao +3 more
The paper proposes a black-box attack using a hierarchical genetic algorithm to induce 'overthinking' in Large Reasoning Models, demonstrating that this vulnerability can cause significant resource ex…
The paper introduces Multifidelity Proper Orthogonal Decomposition (MFPOD), a method that significantly reduces the computational cost of dimension reduction by intelligently combining data from cheap…
Yubin Qu, Ying Zhang, Yanjun Zhang, Gelei Deng +3 more
The paper introduces OverEager-Gen, a new benchmark that measures 'overeager actions'—where coding agents perform unauthorized tasks beyond a benign request—and finds that removing explicit consent de…
Xinjue Wang, Xiuheng Wang, Yejun Zhang, Sergiy A. Vorobyov +2 more
The paper investigates whether using fine-grained, tensorized adapters (CP components) instead of standard LoRA ranks improves the accuracy-budget trade-off in PEFT, finding that while they fill budge…
The paper proposes a novel neural network compression technique that aggregates neurons with similar functional dynamics, achieving significant model size reduction while maintaining high accuracy.
The paper enhances the security of the PolyProtect biometric template protection method by proposing a key selection algorithm that significantly increases the difficulty of inverting protected face t…
The paper investigates applying Riemannian optimization techniques to low-rank matrix parameters for deep learning, but finds that the proposed methods do not conclusively outperform the AdamW baselin…