Papers similar to 2604.27456v1

Similar to 2604.27456v1 · 20 results

cs.CR · cs.AI · Recent · Apr 8, 2026

Private Seeds, Public LLMs: Realistic and Privacy-Preserving Synthetic Data Generation

The paper proposes RPSG, a method that uses private seeds and differential privacy to generate highly realistic and strongly privacy-preserving synthetic data replicas of private text for LLMs.

cs.IT · cs.CR · Theoretical · Recent · Jul 20, 2026

Optimal Domain-Aware Privacy Mechanisms for Synthetic Data Generation

Sajani Vithana, Sangwon Jung, Haoyang Hu, Viveck R. Cadambe +2 more

This paper lays the theoretical foundation for incorporating public data into differential privacy mechanisms for improved synthetic data generation.

cs.CR · cs.LG · Recent · Apr 14, 2026

Evaluating Differential Privacy Against Membership Inference in Federated Learning: Insights from the NIST Genomics Red Team Challenge

Gustavo de Carvalho Bertoli

This paper empirically evaluates the effectiveness of Differential Privacy (DP) against Membership Inference Attacks (MIAs) in Federated Learning, demonstrating that a stacking attack strategy can det…

cs.LG · cs.CR · Recent · Apr 29, 2026

Fidelity, Diversity, and Privacy: A Multi-Dimensional LLM Evaluation for Clinical Data Augmentation

Guillermo Iglesias, Gema Bello-Orgaz, María Navas-Loro, Cristian Ramirez-Atencia +2 more

This paper evaluates multiple LLMs (DeepSeek-R1, OpenBioLLM-Llama3, Qwen 3.5) for generating privacy-safe, high-quality synthetic mental health reports, demonstrating their effectiveness in expanding…

cs.CE · cs.AI · cs.CR · Recent · Apr 16, 2026

Decoupling Identity from Utility: Privacy-by-Design Frameworks for Financial Ecosystems

Ifayoyinsola Ibikunle, Tyler Farnan, Senthil Kumar, Mayana Pereira

The paper proposes using Differentially Private (DP) synthetic data, specifically through tabular synthesis and DP-Seeded Agent-Based Modeling (ABM), to resolve the conflict between data utility and p…

cs.CR · Recent · Apr 17, 2026

DPDSyn: Improving Differentially Private Dataset Synthesis for Model Training by Downstream Task Guidance

Mingxuan Jia, Wen Huang, Weixin Zhao, Xingyi Wang +2 more

DPDSyn improves differentially private dataset synthesis by training a differentially private AI model on the original private data, which is then used to generate synthetic datasets that maintain hig…

cs.LG · cs.AI · cs.DC · Empirical · Recent · Jul 21, 2026

SynPre-FL: Synthetic data-driven pretraining integrated Federated Learning training framework

Akarsh K Nair, Muhammad Arifur Rahman, Nicholas Shopland, Andy Burton +7 more

This paper proposes SynPre-FL, a framework that combines high-fidelity synthetic EHR generation with federated learning for robust clinical prediction.

cs.IR · Theoretical · Recent · Jun 12, 2026

Private Information Retrieval for Large-Scale DNA-Based Data Storage

Gökberk Erdoğan, Daniella Bar-Lev, Rawad Bitar, Antonia Wachter-Zeh +1 more

This paper proposes two approaches for applying two-server Private Information Retrieval (PIR) protocols to synthetic DNA-based data storage, addressing privacy, efficiency, and feasibility challenges…

cs.CR · cond-mat.other · q-bio.OT · Recent · Mar 17, 2026

Synchronized DNA sources for unconditionally secure cryptography

Sandra Jaudou, Hélène Gasnier, Elias Boudjella, Marc Canève +10 more

The paper introduces a DNA-based cryptographic primitive that uses shared, sequenced DNA molecules to generate a common binary mask for One-Time Pad (OTP) encryption, achieving unconditional security…

cs.CR · cs.LG · Recent · Mar 24, 2026

Privacy-Preserving EHR Data Transformation via Geometric Operators: A Human-AI Co-Design Technical Report

Maolin Wang, Beining Bao, Gan Yuan, Hongyu Chen +8 more

The paper proposes a novel data transformation framework that creates semantically rich, privacy-preserving numeric views of EHR data, enabling large-scale research while provably breaking patient lin…

cs.CR · cs.IR · cs.LG · Recent · May 31, 2026

Differentially Private Datastore Generation for Retrieval-Augmented Inference

Abdelrahman Abouelenein, Marwan Torki

The paper proposes a hashing-based framework using Differential Privacy to generate and release private datastores for retrieval-augmented AI systems, achieving strong privacy with minimal accuracy lo…

cs.CR · cs.AI · cs.LG · Recent · May 8, 2026

Seed Hijacking of LLM Sampling and Quantum Random Number Defense

Ziyang You, Xiaoke Yang, Zhanling Fan, Feng Guo +2 more

The paper introduces SeedHijack, a backdoor attack that manipulates the pseudorandom number generation process in LLMs to force specific token selections, and proposes a hardware quantum random number…

cs.CR · cs.AI · cs.DC · Recent · Apr 15, 2026

Secure and Privacy-Preserving Vertical Federated Learning

Shan Jin, Sai Rahul Rachuri, Yizhen Wang, Anderson C. A. Nascimento +1 more

The paper proposes an optimized, end-to-end privacy-preserving framework for vertical federated learning by distributing aggregation roles across multiple servers using secure multiparty computation a…

cs.CR · cs.AI · cs.CV · Recent · Mar 30, 2026

FedFG: Privacy-Preserving and Robust Federated Learning via Flow-Matching Generation

Ruiyang Wang, Rong Pan, Zhengan Yao

FedFG introduces a robust federated learning framework using flow-matching generation to simultaneously enhance client privacy and defend against sophisticated poisoning attacks.

cs.CR · Recent · Apr 29, 2026

PRAG: End-to-End Privacy-Preserving Retrieval-Augmented Generation

Zhijun Li, Minghui Xu, Huayi Qi, Wenxuan Yu +5 more

PRAG is an end-to-end privacy-preserving Retrieval-Augmented Generation (RAG) system that maintains high retrieval accuracy and scalability in cloud environments by encrypting both documents and queri…

cs.LG · cs.AI · cs.CL · Recent · May 14, 2026

MetaMoE: Diversity-Aware Proxy Selection for Privacy-Preserving Mixture-of-Experts Unification

Weisen Jiang, Shuhao Chen, Sinno Jialin Pan

MetaMoE introduces a privacy-preserving framework that unifies independently trained, domain-specialized experts into a single Mixture-of-Experts (MoE) model using diversity-aware proxy data.

cs.LG · cs.AI · cs.CR · Recent · Apr 22, 2026

Differentially Private Model Merging

Qichuan Yin, Manzil Zaheer, Tian Li

This paper proposes two post-processing techniques, random selection and linear combination, to construct a model that satisfies any desired differential privacy level without retraining, given a set…

cs.CR · cs.LG · Recent · Apr 18, 2026

Towards Deep Encrypted Training: Low-Latency, Memory-Efficient, and High-Throughput Inference for Privacy-Preserving Neural Networks

Nges Brian Njungle, Eric Jahns, Michel A. Kinsy

This paper develops optimized algorithms and a pipeline architecture for high-throughput, memory-efficient batch processing of encrypted neural network inference, significantly improving performance o…

cs.CR · cs.AI · Recent · May 4, 2026

Privacy Preserving Machine Learning Workflow: from Anonymization to Personalized Differential Privacy Budgets in Federated Learning

Judith Sáinz-Pardo Díaz, Álvaro López García

This paper proposes a comprehensive federated learning workflow that enhances privacy and robustness by integrating personalized differential privacy budgets and client drift detection, achieving bett…

cs.CR · Recent · Apr 26, 2026

Spore: Efficient and Training-Free Privacy Extraction Attack on LLMs via Inference-Time Hybrid Probing

Yu Cui, Ruiqing Yue, Hang Fu, Sicheng Pan +5 more

The paper introduces Spore, a novel, training-free, and highly efficient privacy extraction attack that targets sensitive information stored in the memory of LLM agents during inference, outpe…
