Papers similar to 2606.01292 · 17 results

math.NA · cs.LG · Recent · Jun 1, 2026

Spectral Audit of In-Context Operator Networks

Zhiwei Gao, Liu Yang, George Em Karniadakis

The paper introduces a Jacobian-based spectral audit to evaluate neural operators, demonstrating that standard prediction error metrics fail to capture crucial local dynamical structures and operator…

cs.CV · cs.AI · cs.LG · Recent · May 30, 2026

DASH: Dual-Branch Score Distillation for Guidance-Calibrated Compact Diffusion Models

Abdullah Al Shafi, Kazi Saeed Alam, Sk Imran Hossain, Engelbert Mephu Nguifo

DASH introduces a dual-branch distillation framework to effectively compress class-conditional diffusion models by independently supervising both score branches, significantly preserving guidance fide…

cs.LG · cs.AI · math.OC · Recent · May 29, 2026

Unlearning in Diffusion Models: A Unified Framework with KL Divergence and Likelihood Constraints

Shervin Khalafi, Alejandro Ribeiro, Dongsheng Ding

The paper proposes a unified, constrained optimization framework using KL divergence and likelihood constraints to achieve effective and principled unlearning in diffusion models.

cs.LG · cs.CL · Recent · May 31, 2026

Trust Functions: Near-Lossless Weak-to-Strong Generalization by Learning When to Trust the Weak Teacher

Arda Uzunoglu, Alvin Zhang, Daniel Khashabi

The paper introduces trust functions to filter weak supervision labels, enabling near-lossless weak-to-strong generalization by selectively training a strong student using only the most reliable weak…

cs.LG · Recent · Jun 4, 2026

TailLoR: Protecting Principal Components in Parameter-Efficient Continual Learning

Marius Dragoi, Ioana Pintilie, Alexandra Dragomir, Antonio Barbalau +1 more

TailLoR is a new parameter-efficient finetuning method that uses the singular bases of pre-trained weights to learn low-rank updates, specifically penalizing updates along dominant directions to impro…

cs.LG · cs.AI · cs.IR · Recent · May 28, 2026

LoopFM: Learning frOm HistOrical RePresentations of Foundation Model for Recommendation

Shali Jiang, Hua Zheng, Boyang Liu, Laming Chen +39 more

LoopFM proposes a novel framework to significantly improve knowledge distillation for recommendation systems by structuring the rich intermediate embeddings of large foundation models as input feature…

cs.LG · Recent · Jun 1, 2026

Why Are DMD Students Lazy? Understanding the Copying Behavior in Few-Step Distillation

Shucheng Li, Iolo Jones, Alexander Tong, Michael M. Bronstein

This paper investigates the phenomenon of 'copying' in Distribution Matching Distillation (DMD), finding that high-dimensional distillation causes student models to spontaneously reproduce the teacher…

cs.CR · cs.AI · cs.LG · Recent · May 20, 2026

Frequency-Domain Regularized Adversarial Alignment for Transferable Attacks against Closed-Source MLLMs

Leitao Yuan, Qinghua Mao, Daizong Liu, Kun Wang +4 more

The paper proposes FRA-Attack, a frequency-domain regularization method, to significantly improve the transferability of adversarial attacks against closed-source Multimodal Large Language Models (MLL…

cs.LG · cs.AI · Recent · May 27, 2026

Learning Theory of the SVRG: Generalization and Convergence Analysis

Yunwen Lei, Zimeng Wang, Xiaoming Yuan

This paper provides the first non-vacuous generalization analysis for the Stochastic Variance Reduced Gradient (SVRG) method by establishing sharp, data-dependent algorithmic stability bounds, thereby…

cs.LG · cs.AI · stat.ML · Recent · May 28, 2026

On the Optimizer Dependence of Neural Scaling Laws

Vansh Ramani, Shourya Vir Jain

The scaling exponent in neural scaling laws is not fixed but systematically depends on the optimizer used, with preconditioned optimizers generally yielding steeper scaling.

cs.CR · Recent · May 26, 2026

GradSentry: Gradient Spectral Entropy for Backdoor Sample Filtering in Large Language Model Fine-Tuning

Haodong Zhao, Tianyi Xu, Tianhang Zhao, Zhuosheng Zhang +1 more

GradSentry introduces a novel backdoor sample filtering method that uses the spectral entropy of individual sample gradients to detect poisoned data during LLM fine-tuning, proving effective even at h…

cs.LG · cs.AI · cs.CV · Recent · May 28, 2026

TRACER: Persistent Regularization for Robust Multimodal Finetuning

Hesam Asadollahzadeh, Feng Liu, Christopher Leckie, Sarah M. Erfani

The paper introduces TRACER, a novel regularization framework that uses Weighted Moving Average (WMA) distillation to robustly finetune multimodal models, mitigating catastrophic forgetting and improv…

cs.CL · cs.CR · cs.LG · Recent · Apr 3, 2026

Learning the Signature of Memorization in Autoregressive Language Models

David Ilić, Kostadin Cvejoski, David Stanojević, Evgeny Grigorenko

The paper introduces a novel, transferable learned attack (LT-MIA) that detects a universal 'signature of memorization' in language models, achieving high accuracy across diverse model architectures (…

cs.LG · cs.AI · Recent · May 28, 2026

Do Physics Foundation Models Learn Generalizable Physics? A Bias-Aware Benchmark Across Physical Regimes and Distribution Shifts

Mengdi Chu, Yang Liu, Ayan Biswas, Han-Wei Shen

The paper introduces a comprehensive benchmark to test if physics foundation models learn generalizable dynamics, finding that their performance is highly conditional and not universally general.

cs.LG · cs.AI · Recent · May 28, 2026

Score Broadcast and Decorrelation: A General Framework for Broadcast-Based Credit Assignment

Mustafa Uzun, Mete Erdogan, Cengiz Pehlevan, Alper T. Erdogan

The paper introduces Score Broadcast and Decorrelation (SBD), a general theoretical framework that unifies broadcast-based credit assignment across various differentiable loss functions by leveraging…

cs.LG · cs.AI · cs.CV · Recent · May 30, 2026

On the Difficulty of Learning a Meta-network for Training Data Selection

Zilin Du, Junqi Zhao, Boyang Albert Li

This paper analyzes the poor performance of Meta-learning for Training-data Selection (MTS) and proposes that increasing the batch size and incorporating informative features can significantly improve…

cs.LG · cs.AI · cs.CV · Recent · May 28, 2026

How Much Is a Dataset Worth? Scaling Laws, the Vendi Score, and Matrix Spectral Functions

Jeff A. Bilmes, Gantavya Bhatt, Arnav M. Das

The paper introduces and analyzes several novel data appraisal metrics, including the Vendi Score and matrix spectral functions, demonstrating that efficient optimization techniques make these metrics…
