Papers similar to 2605.30599 · 20 results

cs.LG, cs.AI, cs.CR · Recent · Jun 2, 2026

PURGE: Projected Unlearning via Retain-Guided Erasure

Vedant Jawandhia, Daksh Ahuja, Ghufran Alam Siddiqui, Prashant Trivedi +2 more

PURGE is a novel machine unlearning algorithm that leverages the duality between continual learning and unlearning to achieve high data retention while making the unlearned model indistinguishable fro…

cs.LG, cs.CL · Recent · May 28, 2026

MAAT: Multi-phase Adapter-Aware Targeted Unlearning

Suryash Yagnik, Shubham Gaur, Saksham Thakur, Vinija Jain +2 more

The paper introduces 5WBENCH, a new benchmark for causal unlearning, and proposes MAAT, a novel three-phase framework that achieves high forgetting and high retention specifically on complex 'Why'-typ…

cs.CL · Recent · May 31, 2026

Med-HEAL: Analyzing and Mitigating Hallucinations in Medical LLMs with Hallucination-Aware In-Context Learning

Yiming Liao, Zeno Franco, Jose Eduardo Lizarraga Mazaba, Keke Chen

The paper introduces Med-HEAL, a comprehensive framework and dataset for systematically identifying and mitigating hallucinations in medical LLMs, demonstrating that a self-critique pipeline significa…

cs.AI, cs.CL, cs.CY · Recent · May 27, 2026

MIRA: A Bilingual Benchmark for Medical Information Response Audit

Mengyu Xu, Qiaoxin Yang, Qianqian Wang, Xiwei Dai +2 more

The paper introduces MIRA, a bilingual benchmark that reveals that LLMs tend to dilute or omit critical medical information when responding to prompts from users with low health literacy, a pattern te…

cs.LG, cs.CR · Recent · Apr 5, 2026

Towards Unveiling Vulnerabilities of Large Reasoning Models in Machine Unlearning

Aobo Chen, Chenxu Zhao, Chenglin Miao, Mengdi Huai

The paper proposes a novel bi-level exact unlearning attack targeting Large Reasoning Models (LRMs) that forces incorrect final answers while generating misleading reasoning traces, highlighting new s…

cs.CR · Recent · May 1, 2026

Revisiting Privacy Leakage in Machine Unlearning: Membership Inference Beyond the Forgotten Set

Jie Fu, Nima Naderloui, Da Zhong, Yuan Hong +1 more

This paper introduces TC-UMIA, a novel tri-class membership inference attack, demonstrating that machine unlearning can leak privacy risks to the retained data set, and evaluates defense mechanisms to…

cs.AI · Recent · May 27, 2026

Better Accuracies, Worse Reasoning: A Step-Level Audit of Medical Chain-of-Thought Distillation

Zhaoyang Jiang, Xuanqi Peng, Fei Teng, Zhizhong Fu +4 more

The paper demonstrates that while distilling large language models for medical QA can significantly improve final answer accuracy, this gain often comes at the cost of factual accuracy and detailed re…

cs.LG, cs.CR · Recent · May 9, 2026

Classification-Head Bias in Class-Level Machine Unlearning: Diagnosis, Mitigation, and Evaluation

Weidong Zheng, Kongyang Chen, Yuanwei Guo, Yatie Xiao

This paper diagnoses a bias-dominated shortcut in class-level machine unlearning, where forgetting is achieved by suppressing classification head biases, and proposes bias-aware mechanisms to mitigate…

cs.CR, cs.LG · Recent · Apr 5, 2026

Jellyfish: Zero-Shot Federated Unlearning Scheme with Knowledge Disentanglement

Houzhe Wang, Xiaojie Zhu, Chi Chen

The paper proposes Jellyfish, a zero-shot federated unlearning scheme that effectively removes the influence of forgotten data from federated learning models while maintaining model utility and privac…

cs.CR, cs.AI · Recent · May 11, 2026

Knowledge Poisoning Attacks on Medical Multi-Modal Retrieval-Augmented Generation

Peiru Yang, Haoran Zheng, Tong Ju, Shiting Wang +5 more

The paper proposes M³Att, a knowledge-poisoning framework that injects covert misinformation into medical multimodal RAG systems using paired visual data triggers, demonstrating attac…

cs.CL · Recent · May 29, 2026

Divergence Decoding: Inference-Time Unlearning via Auxiliary Models

Humzah Merchant, Bradford Levy

Divergence Decoding (DD) is a novel, effective, and inexpensive method that uses auxiliary models to steer LLM logits during inference, enabling the removal of memorized sensitive data without signifi…

cs.IR, cs.CL · Recent · May 29, 2026

Evaluating Factual Density in Multi-Source RAG: A Study in Medical AI Accuracy

Michael R. DeMarco

The paper introduces Factual Density (FD*), a novel retrieval signal that measures the proportion of verified facts, demonstrating that optimizing RAG retrieval based on this density significantly imp…

cs.LG, cs.CR · Recent · Apr 6, 2026

Forgetting to Witness: Efficient Federated Unlearning and Its Visible Evaluation

Houzhe Wang, Xiaojie Zhu, Chi Chen

This paper introduces the first complete pipeline for federated unlearning, proposing an efficient unlearning approach and a novel visualization framework (Skyeye) to evaluate a model's forgetting cap…

cs.AI · Recent · May 28, 2026

Think Fast, Talk Smart: Partitioning Deterministic and Neural Computation for Structured Health Text Generation

Kai-Chen Cheng, Haejun Han, David Q. Sun

The paper proposes 'Think Fast, Talk Smart,' a pipeline that separates deterministic data analysis from LLM generation, showing that offloading recurring, structured tasks to code significantly improv…

cs.LG, cs.AI, cs.CR · Recent · May 12, 2026

SoK: Unlearnability and Unlearning for Model Dememorization

Mengying Zhang, Derui Wang, Ruoxi Sun, Xiaoyu Xia +2 more

This paper provides the first integrated analysis of model dememorization, unifying unlearnability and unlearning methods, and offering theoretical guarantees on dememorization depth.

cs.CL, cs.AI, cs.LG · Recent · May 27, 2026

MemGuard: Preventing Memory Contamination in Long-Term Memory-Augmented Large Language Models

Hyeonjeong Ha, Jeonghwan Kim, Cheng Qian, Jiayu Liu +6 more

MemGuard introduces a type-aware memory framework to prevent heterogeneous memory contamination in long-term memory-augmented LLMs, significantly improving memory reliability and efficiency.

cs.CV, cs.AI · Recent · May 29, 2026

SUPREME: A Multi-GPU Framework for Reproducible Image Unlearning Method Evaluation

Petros Andreou, Jamie Lanyon, Axel Finke, Georgina Cosma

SUPREME is an open-source, multi-GPU framework designed to efficiently and reproducibly evaluate machine unlearning methods for image classification by distributing computationally intensive tasks acr…

cs.LG, cs.AI · Recent · Jun 1, 2026

How Hard Can It Be? Hardness-Aware Multi-Objective Unlearning

Jiangwei Chen, Xinyuan Niu, Rachael Hwee Ling Sim, Zhengyuan Liu +2 more

The paper proposes a novel, theoretically-grounded algorithm (HAMU) that addresses the challenge of machine unlearning by guaranteeing specified improvements in forget quality while minimizing retain…

cs.LG, cs.CR · Recent · Mar 19, 2026

Attack by Unlearning: Unlearning-Induced Adversarial Attacks on Graph Neural Networks

Jiahao Zhang, Yilong Wang, Suhang Wang

This paper introduces 'unlearning corruption attacks,' demonstrating that the performance degradation inherent in approximate graph unlearning can be exploited by an adversary to significantly reduce…

cs.CL, cs.AI · Recent · May 28, 2026

Predicting Causal Effects from Natural Language Queries using Structured Representations

Giuliano Martinelli, Piriyakorn Piriyatamwong, Abelardo Carlos Martinez Lorenzo, Jasmin Baier +6 more

The paper introduces Query2Effect, a large-scale benchmark, and a two-step framework to predict causal effect sizes from natural language queries, showing that structured representation significantly…
