~ similar to 2604.07486v2· 20 results
The paper proposes a novel framework that enables multiple institutions to jointly train a synthetic genomic data generator without revealing their raw data, thereby facilitating large-scale, privacy-…
Peihan Liu, Lucas Rosenblatt, Weiwei Kong, Natalia Ponomareva +6 more
The paper introduces ContinuousBench, a dynamic benchmark designed to rigorously test if differentially private (DP) synthetic text can genuinely transfer new knowledge and capabilities from sensitive…
Peihan Liu, Lucas Rosenblatt, Weiwei Kong, Natalia Ponomareva +6 more
The paper introduces ContinuousBench, a novel benchmark designed to rigorously test if differentially private (DP) synthetic text can genuinely transfer new knowledge, finding that state-of-the-art DP…
Erchi Wang, Pengrun Huang, Eli Chien, Om Thakkar +3 more
The paper introduces DPrivBench, a new benchmark to test whether large language models (LLMs) can automate the complex reasoning required to verify differential privacy guarantees for algorithms.
This paper develops a differential privacy framework to analyze and optimize privacy leakage from AI agent responses that utilize sensitive enterprise data, focusing on deriving optimal generation par…
The paper introduces a novel, efficient mechanism based on permute-and-flip for applying differential privacy to symbolic state trajectories, significantly reducing the computational overhead compared…
Mingxuan Jia, Wen Huang, Weixin Zhao, Xingyi Wang +2 more
DPDSyn improves differentially private dataset synthesis by training a differentially private AI model on the original private data, which is then used to generate synthetic datasets that maintain hig…
The paper evaluates LLM-based simulators for generating differentially private synthetic data, finding that while they show promise for utility, they suffer from significant distribution drift due to…
The paper quantifies the cost of privacy in language identification and generation using differentially private (DP) methods, finding that the cost is surprisingly mild, particularly absent under appr…
This paper evaluates multiple LLMs (DeepSeek-R1, OpenBioLLM-Llama3, Qwen 3.5) for generating privacy-safe, high-quality synthetic mental health reports, demonstrating their effectiveness in expanding…
Zhijun Li, Minghui Xu, Huayi Qi, Wenxuan Yu +5 more
PRAG is an end-to-end privacy-preserving Retrieval-Augmented Generation (RAG) system that maintains high retrieval accuracy and scalability in cloud environments by encrypting both documents and queri…
Peihua Mai, Xuanrong Gao, Youlong Ding, Xianglong Du +2 more
SharedRequest introduces a model-agnostic framework that enhances LLM privacy and efficiency by batching and mixing prompts with noisy variants, achieving high utility and significant cost reduction.
The paper introduces DPPrefSyn, a novel algorithm that generates differentially private synthetic preference data, enabling privacy-preserving alignment of large language models.
The paper introduces DPPrefSyn, a novel algorithm that generates differentially private synthetic preference data, enabling privacy-preserving alignment of large language models.
Jeongho Yoon, Chanhee Park, Yongchan Chun, Hyeonseok Moon +1 more
The paper introduces Privacy-Preserving Fine-Tuning (PPFT), a novel two-stage pipeline that allows LLMs to process sensitive data via pooled embeddings rather than raw text, achieving a strong balance…
Haichao Sha, Zihao Wang, Yuncheng Wu, Hong Chen +1 more
The paper proposes DP-SelFT, a novel framework for differentially private selective fine-tuning that significantly improves the privacy-utility trade-off for LLMs by intelligently selecting robust par…
The paper proposes using Differentially Private (DP) synthetic data, specifically through tabular synthesis and DP-Seeded Agent-Based Modeling (ABM), to resolve the conflict between data utility and p…
Kassem Fawaz, Ren Yi, Octavian Suciu, Rishabh Khandelwal +3 more
The paper introduces Narriva, a method that generates text-based synthetic privacy personas grounded in past user behavior to accurately and efficiently simulate individual and population-level privac…
Weijun Li, Arnaud Grivet Sébert, Qiongkai Xu, Annabelle McIver +1 more
The paper proposes an empirical calibration method, TeDA, to provide a more comparable and interpretable assessment of privacy loss for text rewriting mechanisms under Local Differential Privacy (LDP)…
The paper proposes a hashing-based framework using Differential Privacy to generate and release private datastores for retrieval-augmented AI systems, achieving strong privacy with minimal accuracy lo…