Papers similar to 2606.01719v1

~ similar to 2606.01719v1· 20 results

cs.LGcs.AIcs.CRRecentJun 1, 2026

Fair Finetuning Mitigates Distribution Inference Attacks

The paper proposes Fair Fine-tuning (FFt), a method that fine-tunes a model using an Equalized Odds constraint on a complementary distribution, and provides a formal theoretical bound linking this fai…

View →

cs.CRcs.AIcs.LGRecentMay 14, 2026

One Step to the Side: Why Defenses Against Malicious Finetuning Fail Under Adaptive Adversaries

Itay Zloczower, Eyal Lenga, Gilad Gressel, Yisroel Mirsky

The paper demonstrates that current defenses against malicious fine-tuning of foundation models are insufficient because they only address fixed attacks, and introduces a unified adaptive attack that…

View →

cs.LGcs.CRRecentMay 25, 2026

On Reliability of Efficient Membership Inference Vulnerability Evaluation

Joonas Jälkö, Gauri Pradhan, Ossi Räisä, Antti Honkela

This paper analyzes the reliability of efficient membership inference attack (MIA) evaluation methods, demonstrating that standard aggregation techniques introduce biases that compromise accurate vuln…

View →

cs.LGcs.CRRecentMay 19, 2026

An exponential mechanism based on quadratic approximations for fine-tuning machine learning models with privacy guarantees

Hoang Tran, Jorge Ramirez, Jiayi Wang, Alberto Bocchinfuso +2 more

The paper proposes a novel exponential mechanism using quadratic approximations to fine-tune machine learning models on sensitive data while providing strong differential privacy guarantees.

View →

cs.CRcs.CLcs.DCRecentApr 27, 2026

A Survey on Split Learning for LLM Fine-Tuning: Models, Systems, and Privacy Optimizations

Zihan Liu, Yizhen Wang, Rui Wang, Xiu Tang +1 more

This survey provides a comprehensive, structured taxonomy of split learning techniques for fine-tuning Large Language Models (LLMs), covering model optimization, system efficiency, and privacy preserv…

View →

cs.LGcs.CRRecentMay 27, 2026

Density-aware Sample-specific Attack

Qiyuan Wang, Yao Li, Raymond K. W. Wong

This paper proposes a density-aware attack that constructs triggers by placing poisoned samples in low-density regions of the clean data distribution, achieving high attack success rates even after st…

View →

cs.CRcs.AIRecentApr 30, 2026

Secret Stealing Attacks on Local LLM Fine-Tuning through Supply-Chain Model Code Backdoors

Zi Li, Tian Zhou, Wenze Li, Jingyu Hua +2 more

This paper introduces a novel supply-chain attack that uses model code backdoors to actively steal sensitive secrets from local LLM fine-tuning datasets, bypassing current privacy defenses.

View →

cs.CRcs.LGRecentApr 6, 2026

Fine-Tuning Integrity for Modern Neural Networks: Structured Drift Proofs via Norm, Rank, and Sparsity Certificates

Zhenhang Shang, Kani Chen

The paper introduces Fine-Tuning Integrity (FTI), a security goal that uses Succinct Model Difference Proofs (SMDPs) to cryptographically prove that a fine-tuned model update adheres to specific struc…

View →

cs.GTcs.CRcs.LGRecentMay 8, 2026

Quotient Semivalues for False-Name-Resistant Data Attribution

Florian A. D. Burnat, Brittany I. Davidson

The paper introduces the quotient semivalue mechanism to provide fair data attribution that is resistant to contributors manipulating their reported identities by splitting or duplicating data.

View →

cs.LGcs.CRRecentMay 11, 2026

Unlearning with Asymmetric Sources: Improved Unlearning-Utility Trade-off with Public Data

Ahmed Mehdi Inane, Vincent Quirion, Gintare Karolina Dziugaite, Ioannis Mitliagkas

The paper introduces Asymmetric Langevin Unlearning (ALU), a novel framework that uses public data to significantly reduce the utility loss typically associated with certified machine unlearning, enab…

View →

cs.CRRecentMay 14, 2026

Privacy Auditing with Zero (0) Training Run

Tudor Cebere, Mathieu Even, Linus Bleistein, Aurélien Bellet

The paper introduces Zero-Run privacy auditing, a post-hoc framework that allows for practical differential privacy evaluation of large, deployed models without requiring retraining or controlled data…

View →

cs.CRcs.AIRecentApr 8, 2026

Towards Privacy-Preserving Large Language Model: Text-free Inference Through Alignment and Adaptation

Jeongho Yoon, Chanhee Park, Yongchan Chun, Hyeonseok Moon +1 more

The paper introduces Privacy-Preserving Fine-Tuning (PPFT), a novel two-stage pipeline that allows LLMs to process sensitive data via pooled embeddings rather than raw text, achieving a strong balance…

View →

cs.CRcs.PLRecentMay 28, 2026

A Bayesian Approach to Membership Inference for Statistical Release

Lisa Oakley, Sam Stites, Cameron Moy, Steven Holtzen +2 more

This paper proposes a Bayesian framework to enhance membership inference attacks against released statistics by incorporating prior knowledge about the population's attribute dependency structure, out…

View →

cs.CRcs.CVRecentApr 14, 2026

CoLA: A Choice Leakage Attack Framework to Expose Privacy Risks in Subset Training

Qi Li, Cheng-Long Wang, Yinzhi Cao, Di Wang

This paper introduces CoLA, a framework demonstrating that subset training, while efficient, introduces new and potentially greater privacy risks by leaking information about both data membership and…

View →

cs.LGcs.CRcs.CVRecentMay 22, 2026

Sample-wise Targeted Adversarial Attacks on Test-time Adaptation

Phuc Duc Nguyen, Quang Duc Nguyen

The paper introduces a sample-wise targeted adversarial attack that successfully misclassifies only specific, triggered inputs during test-time adaptation while maintaining the overall label distribut…

View →

cs.CRcs.LGRecentMar 24, 2026

A Critical Review on the Effectiveness and Privacy Threats of Membership Inference Attacks

Najeeb Jebreel, David Sánchez, Josep Domingo-Ferrer

The paper proposes a new evaluation framework showing that, under realistic conditions, Membership Inference Attacks (MIAs) are weak privacy threats, suggesting that relying on them as a primary priva…

View →

cs.LGcs.AIcs.CRRecentMay 7, 2026

PACZero: PAC-Private Fine-Tuning of Language Models via Sign Quantization

Murat Bilgehan Ertan, Xiaochen Zhu, Phuong Ha Nguyen, Marten van Dijk +1 more

The paper introduces PACZero, a novel PAC-private fine-tuning mechanism that achieves usable utility for large language models while providing strong resistance against membership-inference attacks.

View →

cs.CRcs.AIcs.CLRecentMar 25, 2026

AI Security in the Foundation Model Era: A Comprehensive Survey from a Unified Perspective

Zhenyi Wang, Siyu Luan

The paper proposes a unified closed-loop threat taxonomy to systematically analyze and defend foundation models by explicitly framing the bidirectional security interactions between data and models.

View →

cs.CRcs.LGRecentMar 19, 2026

Towards Verifiable AI with Lightweight Cryptographic Proofs of Inference

Pranay Anchuri, Matteo Campanelli, Paul Cesaretti, Rosario Gennaro +3 more

The paper introduces a lightweight, sampling-based cryptographic protocol for verifiable AI inference that drastically reduces proving overhead from minutes to milliseconds by leveraging statistical p…

View →

cs.CRcs.LGRecentApr 14, 2026

Evaluating Differential Privacy Against Membership Inference in Federated Learning: Insights from the NIST Genomics Red Team Challenge

Gustavo de Carvalho Bertoli

This paper empirically evaluates the effectiveness of Differential Privacy (DP) against Membership Inference Attacks (MIAs) in Federated Learning, demonstrating that a stacking attack strategy can det…

View →