Papers similar to 2605.31293

~ similar to 2605.31293· 19 results

cs.CRcs.CLcs.DCRecentApr 27, 2026

A Survey on Split Learning for LLM Fine-Tuning: Models, Systems, and Privacy Optimizations

Zihan Liu, Yizhen Wang, Rui Wang, Xiu Tang +1 more

This survey provides a comprehensive, structured taxonomy of split learning techniques for fine-tuning Large Language Models (LLMs), covering model optimization, system efficiency, and privacy preserv…

View →

cs.CRRecentMay 1, 2026

Revisiting Privacy Leakage in Machine Unlearning: Membership Inference Beyond the Forgotten Set

Jie Fu, Nima Naderloui, Da Zhong, Yuan Hong +1 more

This paper introduces TC-UMIA, a novel tri-class membership inference attack, demonstrating that machine unlearning can leak privacy risks to the retained data set, and evaluates defense mechanisms to…

View →

cs.AIRecentMay 28, 2026

Entropy-KL Divergence-based Token Masking: A Novel Approach for Selective Fine-tuning of Large Language Models

Qi Liu, Mingdi Sun, Yongyi He, Zhi Zheng +4 more

The paper proposes EKSFT, a selective fine-tuning method that masks high-entropy or high-KL divergence tokens during Supervised Fine-Tuning (SFT) to prevent distribution shift and improve subsequent R…

View →

cs.LGcs.AImath.OCRecentMay 29, 2026

Unlearning in Diffusion Models: A Unified Framework with KL Divergence and Likelihood Constraints

Shervin Khalafi, Alejandro Ribeiro, Dongsheng Ding

The paper proposes a unified, constrained optimization framework using KL divergence and likelihood constraints to achieve effective and principled unlearning in diffusion models.

View →

cs.LGcs.AIcs.CRRecentApr 18, 2026

Channel-Level Semantic Perturbations: Unlearnable Examples for Diverse Training Paradigms

Bo Wang, Jia Ni, Mengnan Zhao, Zhan Qin +1 more

This paper systematically investigates unlearnable examples (UEs) across diverse training paradigms, finding that existing UEs fail under pretraining-finetuning (PF) settings, and proposes Shallow Sem…

View →

cs.CLcs.CRcs.LGRecentApr 3, 2026

Learning the Signature of Memorization in Autoregressive Language Models

David Ilić, Kostadin Cvejoski, David Stanojević, Evgeny Grigorenko

The paper introduces a novel, transferable learned attack (LT-MIA) that detects a universal 'signature of memorization' in language models, achieving high accuracy across diverse model architectures (…

View →

cs.LGcs.CRRecentApr 5, 2026

Towards Unveiling Vulnerabilities of Large Reasoning Models in Machine Unlearning

Aobo Chen, Chenxu Zhao, Chenglin Miao, Mengdi Huai

The paper proposes a novel bi-level exact unlearning attack targeting Large Reasoning Models (LRMs) that forces incorrect final answers while generating misleading reasoning traces, highlighting new s…

View →

cs.LGcs.AIRecentMay 29, 2026

De-attribute to Forget for LLM Unlearning

Xinyang Lu, Jiabao Pan, Rachael Hwee Ling Sim, See-Kiong Ng +2 more

The paper proposes DareU, a novel LLM unlearning framework that optimizes unlearning by zeroing out data attribution scores instead of maximizing prediction loss, achieving effective unlearning while…

View →

cs.LGcs.AIcs.CRRecentJun 2, 2026

PURGE: Projected Unlearning via Retain-Guided Erasure

Vedant Jawandhia, Daksh Ahuja, Ghufran Alam Siddiqui, Prashant Trivedi +2 more

PURGE is a novel machine unlearning algorithm that leverages the duality between continual learning and unlearning to achieve high data retention while making the unlearned model indistinguishable fro…

View →

cs.LGcs.AIcs.CRRecentMay 12, 2026

SoK: Unlearnability and Unlearning for Model Dememorization

Mengying Zhang, Derui Wang, Ruoxi Sun, Xiaoyu Xia +2 more

This paper provides the first integrated analysis of model dememorization, unifying unlearnability and unlearning methods, and offering theoretical guarantees on dememorization depth.

View →

cs.LGcs.CRRecentMay 11, 2026

Unlearning with Asymmetric Sources: Improved Unlearning-Utility Trade-off with Public Data

Ahmed Mehdi Inane, Vincent Quirion, Gintare Karolina Dziugaite, Ioannis Mitliagkas

The paper introduces Asymmetric Langevin Unlearning (ALU), a novel framework that uses public data to significantly reduce the utility loss typically associated with certified machine unlearning, enab…

View →

cs.CLcs.AIcs.LGRecentMay 29, 2026

Not All Synthetic Data Is Yours to Learn From

Sina Alemohammad, Li Chen, Richard G. Baraniuk, Zhangyang Wang

Weak self-training on synthetic data can amplify a language model's existing capabilities, but this effect is strictly dependent on the compatibility between the source and student models, not on the…

View →

cs.AIRecentMay 28, 2026

Make LLM Learn to Synthesize from Streaming Experiences through Feedback

Zhenlin Hu, Yan Wang, Zhen Bi, Zihao Xue +6 more

The paper introduces StreamSynth, a sequential setting for synthetic data generation, and proposes SynLearner, a framework that enables LLMs to improve synthesis performance by accumulating and transf…

View →

cs.DCcs.AIRecentJun 1, 2026

Not All Errors Are Equal: A Systematic Study of Error Propagation in Large Language Model Inference

Yafan Huang, Sheng Di, Guanpeng Li

This paper systematically studies how soft errors propagate during Large Language Model (LLM) inference using a novel fault-injection framework, providing critical insights and mitigation strategies f…

View →

cs.MAcs.CRRecentApr 1, 2026

Secure Forgetting: A Framework for Privacy-Driven Unlearning in Large Language Model (LLM)-Based Agents

Dayong Ye, Tainqing Zhu, Congcong Zhu, Feng He +4 more

The paper proposes a comprehensive framework for LLM-based agent unlearning, enabling agents to selectively forget specific knowledge (states, trajectories, or environments) while maintaining performa…

View →

cs.CLcs.AIRecentMay 31, 2026

DSL-LLaDA: Scaling Continuous Denoising to 8B Masked Diffusion LMs

Longxuan Yu, Yunshu Wu, Yu Fu, Siheng Xiong +4 more

The paper introduces DSL-LLaDA, a method that lightly adapts a pre-trained masked diffusion language model to perform continuous denoising in embedding space, significantly improving text generation q…

View →

cs.LGcs.AIcs.CRRecentMay 6, 2026

Information Theoretic Adversarial Training of Large Language Models

Yiwei Zhang, Jeremiah Birrell, Reza Ebrahimi, Rouzbeh Behnia +2 more

The paper proposes WARDEN, a distributionally robust adversarial training framework that significantly reduces LLM vulnerability to adversarial attacks by dynamically reweighting hard adversarial exam…

View →

cs.CRcs.AIRecentApr 2, 2026

Combating Data Laundering in LLM Training

Muxing Li, Zesheng Ye, Sharon Li, Feng Liu

The paper introduces Synthesis Data Reversion (SDR), a method that infers the data laundering transformation used in LLM training and synthesizes queries to restore the detection signals lost when pro…

View →

cs.CRcs.CLcs.LGRecentMay 22, 2026

What Does the Server See? Understanding Privacy Leakage from Large Language Models in Split Inference

Mingyuan Fan, Yu Liu, Fuyi Wang, Cen Chen

The paper introduces ActInv and PAF to systematically analyze and quantify privacy leakage from intermediate activations during split inference of LLMs, proposing PriPert for enhanced defense.

View →