Similar to 2606.01719v1 · 20 results
The paper proposes Fair Fine-tuning (FFt), a method that fine-tunes a model using an Equalized Odds constraint on a complementary distribution, and provides a formal theoretical bound linking this fai…
The paper demonstrates that current defenses against malicious fine-tuning of foundation models are insufficient because they only address fixed attacks, and introduces a unified adaptive attack that…
This paper analyzes the reliability of efficient membership inference attack (MIA) evaluation methods, demonstrating that standard aggregation techniques introduce biases that compromise accurate vuln…
Hoang Tran, Jorge Ramirez, Jiayi Wang, Alberto Bocchinfuso +2 more
The paper proposes a novel exponential mechanism using quadratic approximations to fine-tune machine learning models on sensitive data while providing strong differential privacy guarantees.
Zihan Liu, Yizhen Wang, Rui Wang, Xiu Tang +1 more
This survey provides a comprehensive, structured taxonomy of split learning techniques for fine-tuning Large Language Models (LLMs), covering model optimization, system efficiency, and privacy preserv…
This paper proposes a density-aware attack that constructs triggers by placing poisoned samples in low-density regions of the clean data distribution, achieving high attack success rates even after st…
Zi Li, Tian Zhou, Wenze Li, Jingyu Hua +2 more
This paper introduces a novel supply-chain attack that uses model code backdoors to actively steal sensitive secrets from local LLM fine-tuning datasets, bypassing current privacy defenses.
The paper introduces Fine-Tuning Integrity (FTI), a security goal that uses Succinct Model Difference Proofs (SMDPs) to cryptographically prove that a fine-tuned model update adheres to specific struc…
The paper introduces the quotient semivalue mechanism to provide fair data attribution that is resistant to contributors manipulating their reported identities by splitting or duplicating data.
The paper introduces Asymmetric Langevin Unlearning (ALU), a novel framework that uses public data to significantly reduce the utility loss typically associated with certified machine unlearning, enab…
The paper introduces Zero-Run privacy auditing, a post-hoc framework that allows for practical differential privacy evaluation of large, deployed models without requiring retraining or controlled data…
Jeongho Yoon, Chanhee Park, Yongchan Chun, Hyeonseok Moon +1 more
The paper introduces Privacy-Preserving Fine-Tuning (PPFT), a novel two-stage pipeline that allows LLMs to process sensitive data via pooled embeddings rather than raw text, achieving a strong balance…
Lisa Oakley, Sam Stites, Cameron Moy, Steven Holtzen +2 more
This paper proposes a Bayesian framework to enhance membership inference attacks against released statistics by incorporating prior knowledge about the population's attribute dependency structure, out…
This paper introduces CoLA, a framework demonstrating that subset training, while efficient, introduces new and potentially greater privacy risks by leaking information about both data membership and…
The paper introduces a sample-wise targeted adversarial attack that successfully misclassifies only specific, triggered inputs during test-time adaptation while maintaining the overall label distribut…
The paper proposes a new evaluation framework showing that, under realistic conditions, Membership Inference Attacks (MIAs) are weak privacy threats, suggesting that relying on them as a primary priva…
The paper introduces PACZero, a novel PAC-private fine-tuning mechanism that achieves usable utility for large language models while providing strong resistance against membership-inference attacks.
The paper proposes a unified closed-loop threat taxonomy to systematically analyze and defend foundation models by explicitly framing the bidirectional security interactions between data and models.
The paper introduces a lightweight, sampling-based cryptographic protocol for verifiable AI inference that drastically reduces proving overhead from minutes to milliseconds by leveraging statistical p…
This paper empirically evaluates the effectiveness of Differential Privacy (DP) against Membership Inference Attacks (MIAs) in Federated Learning, demonstrating that a stacking attack strategy can det…