~ similar to 2604.24468v1· 20 results
This paper provides a systematic, lifecycle-based framework for analyzing security threats and defenses across the entire fine-tuning process of LLMs, revealing that attack effectiveness is highly mod…
Haichao Sha, Zihao Wang, Yuncheng Wu, Hong Chen +1 more
The paper proposes DP-SelFT, a novel framework for differentially private selective fine-tuning that significantly improves the privacy-utility trade-off for LLMs by intelligently selecting robust par…
The paper introduces ActInv and PAF to systematically analyze and quantify privacy leakage from intermediate activations during split inference of LLMs, proposing PriPert for enhanced defense.
Karima Makhlouf, Lamiaa Basyoni, Syed Khaderi, Gabriel Marquez +3 more
This paper conducts a structured ablation study using a unified threat model to evaluate how various system factors (like model architecture and retrieval configuration) influence different types of p…
Jeongho Yoon, Chanhee Park, Yongchan Chun, Hyeonseok Moon +1 more
The paper introduces Privacy-Preserving Fine-Tuning (PPFT), a novel two-stage pipeline that allows LLMs to process sensitive data via pooled embeddings rather than raw text, achieving a strong balance…
SafeLM is a comprehensive framework that jointly addresses privacy, security, misinformation, and adversarial robustness in federated LLMs, achieving high safety performance while significantly reduci…
Guanlong Wu, Zhaohan li, Yao Zhang, Zheng Zhang +3 more
CachePrune introduces a privacy-aware, fine-grained KV cache sharing mechanism that allows LLM inference systems to safely reuse cache entries across users' requests, significantly improving efficienc…
Rui Zhang, Hongwei Li, Yun Shen, Xinyue Shen +5 more
The paper investigates how various fine-tuning methods can be used both to intentionally misalign and subsequently realign large language models (LLMs), revealing distinct strengths for attack and def…
Divergence Decoding (DD) is a novel, effective, and inexpensive method that uses auxiliary models to steer LLM logits during inference, enabling the removal of memorized sensitive data without signifi…
The paper introduces an automated framework demonstrating that LLM system instructions are vulnerable to encoding attacks, where structured output requests can bypass safety refusals and leak sensitiv…
This paper investigates the privacy risk of reconstructing Personally Identifiable Information (PII) from Large Language Models (LLMs) that have undergone Supervised Finetuning (SFT), proposing a nove…
Hanlin Cai, Kai Li, Houtianfu Wang, Haofan Dong +3 more
The paper proposes an Augmented Model maniPulation (AugMP) strategy, utilizing graph representation learning, to effectively and stealthily manipulate federated fine-tuning of LLMs, significantly degr…
This paper demonstrates that LLM cascade systems, designed for efficiency, are vulnerable to targeted adversarial attacks that simultaneously degrade both performance and cost-efficiency.
The paper benchmarks local, offline LLMs for confidential translation workflows, demonstrating that while they are viable for privacy-sensitive use, they generally lag behind top commercial NMT system…
The paper systematically evaluates eight privacy-preserving techniques for LLM requests, finding that a combination of local inference, redaction, and semantic rephrasing provides the best overall pro…
Peihua Mai, Xuanrong Gao, Youlong Ding, Xianglong Du +2 more
SharedRequest introduces a model-agnostic framework that enhances LLM privacy and efficiency by batching and mixing prompts with noisy variants, achieving high utility and significant cost reduction.
The paper establishes a standardized security assessment framework and develops a multi-layered defensive system, demonstrating that systematic testing and external defenses are crucial for safe LLM d…
The paper introduces LLM-CEG, an extended framework that uses membership inference attack success rates and model perplexity to systematically audit and optimize the privacy-utility trade-off when fin…
Zi Li, Tian Zhou, Wenze Li, Jingyu Hua +2 more
This paper introduces a novel supply-chain attack that uses model code backdoors to actively steal sensitive secrets from local LLM fine-tuning datasets, bypassing current privacy defenses.
This paper demonstrates that existing open-weight LLM safeguards are vulnerable to simple, non-gradient-based attacks like abliteration and prefilling, significantly increasing the attack success rate…