~ similar to 2605.30808· 20 results
The paper introduces DPPrefSyn, a novel algorithm that generates differentially private synthetic preference data, enabling privacy-preserving alignment of large language models.
Haichao Sha, Zihao Wang, Yuncheng Wu, Hong Chen +1 more
The paper proposes DP-SelFT, a novel framework for differentially private selective fine-tuning that significantly improves the privacy-utility trade-off for LLMs by intelligently selecting robust par…
Mingxuan Jia, Wen Huang, Weixin Zhao, Xingyi Wang +2 more
DPDSyn improves differentially private dataset synthesis by training a differentially private AI model on the original private data, which is then used to generate synthetic datasets that maintain hig…
The paper proposes FedVPA-GP, a federated learning framework that uses a Gumbel-Softmax prior and orthogonal loss to personalize LLM alignment by disentangling conflicting user preferences while maint…
The paper introduces Drifting Preference Optimization (DrPO), an efficient online method for preference finetuning one-step text-to-image generators that avoids complex gradient calculations and model…
The paper proposes RPSG, a method that uses private seeds and differential privacy to generate highly realistic and strongly privacy-preserving synthetic data replicas of private text for LLMs.
The paper quantifies the cost of privacy in language identification and generation using differentially private (DP) methods, finding that the cost is surprisingly mild, particularly absent under appr…
The paper introduces RAG-Pref, a novel, training-free Retrieval Augmented Generation (RAG) method for preference alignment that significantly improves LLM refusal guardrails against agentic attacks wi…
Xiwen Chen, Wenhui Zhu, Jingjing Wang, Peijie Qiu +12 more
S-SPPO introduces a dual-space semantic calibration framework to stabilize Self-Play Preference Optimization (SPPO), preventing policy degeneration when preference oracles assign overly confident wins…
The paper proposes DP-LAC, a novel lightweight adaptive clipping technique for differentially private federated fine-tuning, which efficiently estimates and adapts the clipping threshold without consu…
Erchi Wang, Pengrun Huang, Eli Chien, Om Thakkar +3 more
The paper introduces DPrivBench, a new benchmark to test whether large language models (LLMs) can automate the complex reasoning required to verify differential privacy guarantees for algorithms.
Yilun Qiu, Xiaoyan Zhao, Yang Zhang, Yuxin Chen +6 more
The paper introduces PARL, a framework that learns personalized evaluation rubrics directly from raw user interaction histories to accurately assess how well LLM outputs align with subjective, user-sp…
Peihan Liu, Lucas Rosenblatt, Weiwei Kong, Natalia Ponomareva +6 more
The paper introduces ContinuousBench, a dynamic benchmark designed to rigorously test if differentially private (DP) synthetic text can genuinely transfer new knowledge and capabilities from sensitive…
Peihan Liu, Lucas Rosenblatt, Weiwei Kong, Natalia Ponomareva +6 more
The paper introduces ContinuousBench, a novel benchmark designed to rigorously test if differentially private (DP) synthetic text can genuinely transfer new knowledge, finding that state-of-the-art DP…
The paper proposes a novel, explicitly exploratory iterative Nash Learning from Human Feedback (NLHF) algorithm that achieves strong regret bounds for optimizing LLMs based on complex, non-scalar huma…
Peihua Mai, Xuanrong Gao, Youlong Ding, Xianglong Du +2 more
SharedRequest introduces a model-agnostic framework that enhances LLM privacy and efficiency by batching and mixing prompts with noisy variants, achieving high utility and significant cost reduction.
This paper proposes two post-processing techniques, random selection and linear combination, to construct a model that satisfies any desired differential privacy level without retraining, given a set…
This paper develops a differential privacy framework to analyze and optimize privacy leakage from AI agent responses that utilize sensitive enterprise data, focusing on deriving optimal generation par…
Jeongho Yoon, Chanhee Park, Yongchan Chun, Hyeonseok Moon +1 more
The paper introduces Privacy-Preserving Fine-Tuning (PPFT), a novel two-stage pipeline that allows LLMs to process sensitive data via pooled embeddings rather than raw text, achieving a strong balance…
Kassem Fawaz, Ren Yi, Octavian Suciu, Rishabh Khandelwal +3 more
The paper introduces Narriva, a method that generates text-based synthetic privacy personas grounded in past user behavior to accurately and efficiently simulate individual and population-level privac…