Papers similar to 2605.01129v2

~ similar to 2605.01129v2· 20 results

cs.CRRecentApr 8, 2026

Label Leakage Attacks in Machine Unlearning: A Parameter and Inversion-Based Approach

Weidong Zheng, Kongyang Chen, Yao Huang, Yuanwei Guo +1 more

This paper analyzes and proposes four novel attack methods—based on model parameters and model inversion—to demonstrate that existing machine unlearning techniques can inadvertently leak the categorie…

View →

cs.CVcs.CRcs.LGRecentApr 30, 2026

Machine Unlearning for Class Removal through SISA-based Deep Neural Network Architectures

Ishrak Hamim Mahi, Siam Ferdous, Md Sakib Sadman Badhon, Nabid Hasan Omi +3 more

This paper proposes a modified SISA framework to achieve efficient class-level unlearning in CNNs, allowing the removal of specific data influence without full model retraining.

View →

cs.LGcs.AIcs.CRRecentJun 2, 2026

PURGE: Projected Unlearning via Retain-Guided Erasure

Vedant Jawandhia, Daksh Ahuja, Ghufran Alam Siddiqui, Prashant Trivedi +2 more

PURGE is a novel machine unlearning algorithm that leverages the duality between continual learning and unlearning to achieve high data retention while making the unlearned model indistinguishable fro…

View →

cs.LGcs.CRRecentMay 11, 2026

Unlearning with Asymmetric Sources: Improved Unlearning-Utility Trade-off with Public Data

Ahmed Mehdi Inane, Vincent Quirion, Gintare Karolina Dziugaite, Ioannis Mitliagkas

The paper introduces Asymmetric Langevin Unlearning (ALU), a novel framework that uses public data to significantly reduce the utility loss typically associated with certified machine unlearning, enab…

View →

cs.LGcs.CRRecentApr 5, 2026

Towards Unveiling Vulnerabilities of Large Reasoning Models in Machine Unlearning

Aobo Chen, Chenxu Zhao, Chenglin Miao, Mengdi Huai

The paper proposes a novel bi-level exact unlearning attack targeting Large Reasoning Models (LRMs) that forces incorrect final answers while generating misleading reasoning traces, highlighting new s…

View →

cs.LGcs.CRRecentMar 19, 2026

Attack by Unlearning: Unlearning-Induced Adversarial Attacks on Graph Neural Networks

Jiahao Zhang, Yilong Wang, Suhang Wang

This paper introduces 'unlearning corruption attacks,' demonstrating that the performance degradation inherent in approximate graph unlearning can be exploited by an adversary to significantly reduce…

View →

cs.CLRecentMay 29, 2026

Divergence Decoding: Inference-Time Unlearning via Auxiliary Models

Humzah Merchant, Bradford Levy

Divergence Decoding (DD) is a novel, effective, and inexpensive method that uses auxiliary models to steer LLM logits during inference, enabling the removal of memorized sensitive data without signifi…

View →

cs.MAcs.CRRecentApr 1, 2026

Secure Forgetting: A Framework for Privacy-Driven Unlearning in Large Language Model (LLM)-Based Agents

Dayong Ye, Tainqing Zhu, Congcong Zhu, Feng He +4 more

The paper proposes a comprehensive framework for LLM-based agent unlearning, enabling agents to selectively forget specific knowledge (states, trajectories, or environments) while maintaining performa…

View →

cs.CRRecentApr 18, 2026

Privacy-Aware Machine Unlearning with SISA for Reinforcement Learning-Based Ransomware Detection

Jannatul Ferdous, Rafiqul Islam, Md Zahidul Islam

The paper proposes a privacy-aware machine unlearning framework using SISA training to efficiently remove the influence of specific training data from RL-based ransomware detectors with minimal perfor…

View →

cs.CRRecentMay 14, 2026

Privacy Auditing with Zero (0) Training Run

Tudor Cebere, Mathieu Even, Linus Bleistein, Aurélien Bellet

The paper introduces Zero-Run privacy auditing, a post-hoc framework that allows for practical differential privacy evaluation of large, deployed models without requiring retraining or controlled data…

View →

cs.CRcs.LGRecentMar 24, 2026

A Critical Review on the Effectiveness and Privacy Threats of Membership Inference Attacks

Najeeb Jebreel, David Sánchez, Josep Domingo-Ferrer

The paper proposes a new evaluation framework showing that, under realistic conditions, Membership Inference Attacks (MIAs) are weak privacy threats, suggesting that relying on them as a primary priva…

View →

cs.LGcs.CRRecentMar 30, 2026

ReproMIA: A Comprehensive Analysis of Model Reprogramming for Proactive Membership Inference Attacks

Chihan Huang, Huaijin Wang, Shuai Wang

The paper introduces ReproMIA, a novel and efficient framework that uses model reprogramming to proactively amplify and detect latent privacy leakage for Membership Inference Attacks (MIAs), significa…

View →

cs.CRcs.CVRecentApr 14, 2026

CoLA: A Choice Leakage Attack Framework to Expose Privacy Risks in Subset Training

Qi Li, Cheng-Long Wang, Yinzhi Cao, Di Wang

This paper introduces CoLA, a framework demonstrating that subset training, while efficient, introduces new and potentially greater privacy risks by leaking information about both data membership and…

View →

cs.LGcs.AIcs.CRRecentMay 12, 2026

SoK: Unlearnability and Unlearning for Model Dememorization

Mengying Zhang, Derui Wang, Ruoxi Sun, Xiaoyu Xia +2 more

This paper provides the first integrated analysis of model dememorization, unifying unlearnability and unlearning methods, and offering theoretical guarantees on dememorization depth.

View →

cs.CRcs.LGRecentApr 5, 2026

Jellyfish: Zero-Shot Federated Unlearning Scheme with Knowledge Disentanglement

Houzhe Wang, Xiaojie Zhu, Chi Chen

The paper proposes Jellyfish, a zero-shot federated unlearning scheme that effectively removes the influence of forgotten data from federated learning models while maintaining model utility and privac…

View →

cs.LGcs.AIcs.CRRecentApr 18, 2026

Channel-Level Semantic Perturbations: Unlearnable Examples for Diverse Training Paradigms

Bo Wang, Jia Ni, Mengnan Zhao, Zhan Qin +1 more

This paper systematically investigates unlearnable examples (UEs) across diverse training paradigms, finding that existing UEs fail under pretraining-finetuning (PF) settings, and proposes Shallow Sem…

View →

cs.CRcs.CLcs.LGRecentMay 22, 2026

What Does the Server See? Understanding Privacy Leakage from Large Language Models in Split Inference

Mingyuan Fan, Yu Liu, Fuyi Wang, Cen Chen

The paper introduces ActInv and PAF to systematically analyze and quantify privacy leakage from intermediate activations during split inference of LLMs, proposing PriPert for enhanced defense.

View →

cs.LGcs.CRRecentMay 9, 2026

Classification-Head Bias in Class-Level Machine Unlearning: Diagnosis, Mitigation, and Evaluation

Weidong Zheng, Kongyang Chen, Yuanwei Guo, Yatie Xiao

This paper diagnoses a bias-dominated shortcut in class-level machine unlearning, where forgetting is achieved by suppressing classification head biases, and proposes bias-aware mechanisms to mitigate…

View →

cs.LGcs.CRRecentApr 6, 2026

Forgetting to Witness: Efficient Federated Unlearning and Its Visible Evaluation

Houzhe Wang, Xiaojie Zhu, Chi Chen

This paper introduces the first complete pipeline for federated unlearning, proposing an efficient unlearning approach and a novel visualization framework (Skyeye) to evaluate a model's forgetting cap…

View →

cs.LGcs.CLRecentMay 28, 2026

MAAT: Multi-phase Adapter-Aware Targeted Unlearning

Suryash Yagnik, Shubham Gaur, Saksham Thakur, Vinija Jain +2 more

The paper introduces 5WBENCH, a new benchmark for causal unlearning, and proposes MAAT, a novel three-phase framework that achieves high forgetting and high retention specifically on complex 'Why'-typ…

View →