~ similar to 2604.21421v1· 20 results
The paper introduces HERALD, a token-level cryptographic redaction framework that encrypts only sensitive tokens in clinical text, enabling privacy-preserving LLM deployment without significant loss o…
This paper evaluates multiple LLMs (DeepSeek-R1, OpenBioLLM-Llama3, Qwen 3.5) for generating privacy-safe, high-quality synthetic mental health reports, demonstrating their effectiveness in expanding…
This paper introduces KliniskVestBERT, a suite of BERT models specialized by pre-training on a large, diverse corpus of real-world Norwegian clinical texts, demonstrating superior performance for clin…
Shashie Dilhara Batan Arachchige, Hassan Jameel Asghar, Benjamin Zi Hao Zhao, Dinusha Vatsalan +1 more
The paper proposes a character-level differential privacy mechanism to sanitize sensitive user prompts for LLMs, achieving high privacy for PII while maintaining utility for non-sensitive context.
Weijun Li, Arnaud Grivet Sébert, Qiongkai Xu, Annabelle McIver +1 more
The paper proposes an empirical calibration method, TeDA, to provide a more comparable and interpretable assessment of privacy loss for text rewriting mechanisms under Local Differential Privacy (LDP)…
Jeongho Yoon, Chanhee Park, Yongchan Chun, Hyeonseok Moon +1 more
The paper introduces Privacy-Preserving Fine-Tuning (PPFT), a novel two-stage pipeline that allows LLMs to process sensitive data via pooled embeddings rather than raw text, achieving a strong balance…
Erchi Wang, Pengrun Huang, Eli Chien, Om Thakkar +3 more
The paper introduces DPrivBench, a new benchmark to test whether large language models (LLMs) can automate the complex reasoning required to verify differential privacy guarantees for algorithms.
The paper introduces LLM-CEG, an extended framework that uses membership inference attack success rates and model perplexity to systematically audit and optimize the privacy-utility trade-off when fin…
The paper evaluates the semantic stability of clinical LLMs to linguistic variations, finding that domain specialization does not guarantee consistent robustness improvements.
The paper introduces Reverse Probing, a novel framework that quantifies token-level uncertainty in large language models (LLMs) specifically for clinical text by analyzing internal model activations,…
Karima Makhlouf, Lamiaa Basyoni, Syed Khaderi, Gabriel Marquez +3 more
This paper conducts a structured ablation study using a unified threat model to evaluate how various system factors (like model architecture and retrieval configuration) influence different types of p…
The paper benchmarks local, offline LLMs for confidential translation workflows, demonstrating that while they are viable for privacy-sensitive use, they generally lag behind top commercial NMT system…
Xinyuan Zhu, Zekun Fei, Enye Wang, Ruiqi He +4 more
The paper proposes TRIP-RAG, a dynamic anonymization framework that selectively anonymizes sensitive entities in knowledge bases used for RAG, significantly improving utility while maintaining strong…
The paper quantifies the cost of privacy in language identification and generation using differentially private (DP) methods, finding that the cost is surprisingly mild, particularly absent under appr…
The paper introduces AURA, an LLM-powered mask-reconstruct framework, to improve text anonymization by enhancing resistance to agentic web-search re-identification while better preserving contextual u…
The paper introduces AURA, an LLM-powered mask-reconstruct framework, to improve text anonymization by enhancing resistance to agentic web-search re-identification while better preserving contextual u…
The paper systematically evaluates eight privacy-preserving techniques for LLM requests, finding that a combination of local inference, redaction, and semantic rephrasing provides the best overall pro…
This paper demonstrates that patient-facing RAG chatbots frequently expose sensitive system configurations, knowledge base details, and conversation history through client-server communication, posing…
The paper proposes RPSG, a method that uses private seeds and differential privacy to generate highly realistic and strongly privacy-preserving synthetic data replicas of private text for LLMs.
Haichao Sha, Zihao Wang, Yuncheng Wu, Hong Chen +1 more
The paper proposes DP-SelFT, a novel framework for differentially private selective fine-tuning that significantly improves the privacy-utility trade-off for LLMs by intelligently selecting robust par…