Papers similar to 2606.00241

Similar to 2606.00241 · 19 results

cs.LG · cs.AI · cs.IT · Recent · Jun 1, 2026

Estimating Mutual Information between Time Series and Temporal Event Sequences Across Diverse Analysis Tasks

Haoji Hu, Huaqing Mao, Yijun Lin, Xiaowei Jia +3 more

The paper proposes a novel nonparametric mutual information estimator to robustly quantify dependence between heterogeneous temporal data, specifically continuous time series and discrete event sequen…


cs.LG · cs.CL · Recent · May 28, 2026

MAAT: Multi-phase Adapter-Aware Targeted Unlearning

Suryash Yagnik, Shubham Gaur, Saksham Thakur, Vinija Jain +2 more

The paper introduces 5WBENCH, a new benchmark for causal unlearning, and proposes MAAT, a novel three-phase framework that achieves high forgetting and high retention specifically on complex 'Why'-typ…


cs.LG · cs.AI · Recent · May 28, 2026

Score Broadcast and Decorrelation: A General Framework for Broadcast-Based Credit Assignment

Mustafa Uzun, Mete Erdogan, Cengiz Pehlevan, Alper T. Erdogan

The paper introduces Score Broadcast and Decorrelation (SBD), a general theoretical framework that unifies broadcast-based credit assignment across various differentiable loss functions by leveraging…


cs.LG · cs.AI · Recent · May 27, 2026

On the Learnability of Test-Time Adaptation: A Recovery Complexity Perspective

Zhi Zhou, Ming Yang, Shi-Yu Tian, Kun-Yang Yu +2 more

The paper establishes the first theoretical framework for analyzing the learnability of Test-Time Adaptation (TTA) under non-stationary data streams by introducing Recovery Complexity, which quantifie…


cs.IR · Recent · Jun 3, 2026

Dual-Stream MLP is All You Need for CTR Prediction

Kesha Ou, Zhen Tian, Wayne Xin Zhao, Long Zhang +2 more

This paper proposes a novel framework, DS-MLP, for click-through rate prediction in online advertising and recommendation systems.


cs.CL · cs.AI · Recent · May 28, 2026

Predicting Causal Effects from Natural Language Queries using Structured Representations

Giuliano Martinelli, Piriyakorn Piriyatamwong, Abelardo Carlos Martinez Lorenzo, Jasmin Baier +6 more

The paper introduces Query2Effect, a large-scale benchmark, and a two-step framework to predict causal effect sizes from natural language queries, showing that structured representation significantly…


cs.LG · cs.AI · Recent · May 27, 2026

Locality-Aware Redundancy Pruning for LLM Depth Compression

Vincent-Daniel Yun, Youngrae Kim, Woosang Lim, YoungJin Heo +2 more

The paper proposes Locality-Aware Redundancy Pruning (LoRP), a training-free method that prunes LLM layers by exploiting localized inter-layer redundancy, leading to improved efficiency while maintain…


cs.CL · cs.AI · Recent · Jun 1, 2026

From Layers to Submodules: Rethinking Granularity in Replacement-Based LLM Compression

Elia Cunegatti, Marcus Vukojevic, Erik Nielsen, Giovanni Iacca

The paper proposes SubFit, a novel compression technique that achieves superior LLM compression by replacing non-contiguous, submodule-level components (Attention and FeedForward) with lightweight res…


cs.LG · cs.AI · cs.CV · Recent · May 30, 2026

On the Difficulty of Learning a Meta-network for Training Data Selection

Zilin Du, Junqi Zhao, Boyang Albert Li

This paper analyzes the poor performance of Meta-learning for Training-data Selection (MTS) and proposes that increasing the batch size and incorporating informative features can significantly improve…


cs.LG · math.ST · stat.ME · Recent · Jun 1, 2026

Network Learning with Semi-relaxed Gromov-Wasserstein

Charles Dufour, Ulysse Naepels, Leonardo V. Santoro

The paper proposes a semi-relaxed Gromov-Wasserstein objective to estimate the latent connectivity structure of large-scale networks, achieving statistically consistent and efficient recovery of the u…


cs.LG · cs.CL · Recent · Jun 3, 2026

STRIDE: Training Data Attribution via Sparse Recovery from Subset Perturbations

Rishit Dagli, Abir Harrasse, Luke Zhang, Florent Draye +3 more

This paper proposes a new framework called STRIDE for training data attribution in Large Language Models.


cs.CL · cs.CR · cs.LG · Recent · Apr 3, 2026

Learning the Signature of Memorization in Autoregressive Language Models

David Ilić, Kostadin Cvejoski, David Stanojević, Evgeny Grigorenko

The paper introduces a novel, transferable learned attack (LT-MIA) that detects a universal 'signature of memorization' in language models, achieving high accuracy across diverse model architectures (…


cs.AI · Recent · May 28, 2026

NaRA: Noise-Aware LoRA for Parameter-Efficient Fine-Tuning of Diffusion LLMs

Shuaidi Wang, Zhan Zhuang, Ruping Huang, Yu Zhang

The paper introduces NaRA, a noise-aware LoRA technique that dynamically adapts fine-tuning parameters based on the noise level during diffusion, significantly improving the performance of Diffusion L…


cs.LG · cs.CL · Recent · May 31, 2026

Trust Functions: Near-Lossless Weak-to-Strong Generalization by Learning When to Trust the Weak Teacher

Arda Uzunoglu, Alvin Zhang, Daniel Khashabi

The paper introduces trust functions to filter weak supervision labels, enabling near-lossless weak-to-strong generalization by selectively training a strong student using only the most reliable weak…


cs.AI · Recent · May 28, 2026

Entropy-KL Divergence-based Token Masking: A Novel Approach for Selective Fine-tuning of Large Language Models

Qi Liu, Mingdi Sun, Yongyi He, Zhi Zheng +4 more

The paper proposes EKSFT, a selective fine-tuning method that masks high-entropy or high-KL divergence tokens during Supervised Fine-Tuning (SFT) to prevent distribution shift and improve subsequent R…


cs.CV · cs.AI · cs.LG · Recent · May 30, 2026

DASH: Dual-Branch Score Distillation for Guidance-Calibrated Compact Diffusion Models

Abdullah Al Shafi, Kazi Saeed Alam, Sk Imran Hossain, Engelbert Mephu Nguifo

DASH introduces a dual-branch distillation framework to effectively compress class-conditional diffusion models by independently supervising both score branches, significantly preserving guidance fide…


cs.IR · cs.AI · Recent · May 27, 2026

Fine-Tuned LLM as a Complementary Predictor Improving Ads System

Hui Yang, Daiwei He, Kevin Jiang, Taejin Park +19 more

The paper introduces a novel paradigm where a fine-tuned LLM acts as an ancillary predictor to forecast likely advertisers, significantly improving ad recommendation systems by augmenting candidate ge…


cs.LG · cs.AI · stat.ML · Recent · May 28, 2026

CalArena: A Large-Scale Post-Hoc Calibration Benchmark

Eugène Berta, David Holzmüller, Francis Bach, Michael I. Jordan

The paper introduces CalArena, a large-scale, standardized benchmark covering nearly 2000 experiments to comprehensively evaluate post-hoc calibration methods, finding that smooth calibration function…


cs.CL · cs.IR · Recent · Jun 3, 2026

Caliper: Probing Lexical Anchors versus Causal Structure in LLMs

Zhenyu Yu, Shuigeng Zhou

This paper evaluates the causal reasoning abilities of large language models and finds that they rely heavily on lexical pattern matching rather than structural reasoning.
