~ similar to 2605.23640v1· 20 results
Peihua Mai, Xuanrong Gao, Youlong Ding, Xianglong Du +2 more
SharedRequest introduces a model-agnostic framework that enhances LLM privacy and efficiency by batching and mixing prompts with noisy variants, achieving high utility and significant cost reduction.
The paper systematically evaluates eight privacy-preserving techniques for LLM requests, finding that a combination of local inference, redaction, and semantic rephrasing provides the best overall pro…
Xingyu Lyu, Jianfeng He, Ning Wang, Yidan Hu +4 more
The paper proposes ADAM, a novel and highly effective privacy attack that systematically extracts sensitive data from LLM agent memory by adaptively querying the victim agent's memory based on data di…
Zihan Liu, Yizhen Wang, Rui Wang, Xiu Tang +1 more
This survey provides a comprehensive, structured taxonomy of split learning techniques for fine-tuning Large Language Models (LLMs), covering model optimization, system efficiency, and privacy preserv…
This paper introduces a fingerprinting method that exploits subtle numerical deviations in the inference system components (like the engine or hardware) to reliably identify the specific components us…
The paper introduces ActInv and PAF to systematically analyze and quantify privacy leakage from intermediate activations during split inference of LLMs, proposing PriPert for enhanced defense.
Jeongho Yoon, Chanhee Park, Yongchan Chun, Hyeonseok Moon +1 more
The paper introduces Privacy-Preserving Fine-Tuning (PPFT), a novel two-stage pipeline that allows LLMs to process sensitive data via pooled embeddings rather than raw text, achieving a strong balance…
The paper introduces CIPL, a unified channel-oriented framework, demonstrating that privacy leakage in LLM agents is governed by observable data channels and pipeline interactions, rather than being l…
The paper analyzes the bit-flip vulnerability of shared KV-cache blocks in LLM serving systems, demonstrating that these blocks are susceptible to silent, persistent, and selective data corruption.
Zhijun Li, Minghui Xu, Huayi Qi, Wenxuan Yu +5 more
PRAG is an end-to-end privacy-preserving Retrieval-Augmented Generation (RAG) system that maintains high retrieval accuracy and scalability in cloud environments by encrypting both documents and queri…
Yifei Ge, Zhenpeng Chen, Weisong Sun, Yuchen Chen +6 more
The paper proposes a novel test-driven pipeline that simulates realistic code generation scenarios to detect privacy leaks in LLMs, achieving a 2.56x increase in detected leakage compared to existing…
Chunan Shi, Yilei Chen, Yilin Chen, Xupeng Miao +1 more
The paper proposes AsymCache, a computation-latency-aware KV cache management system that optimizes LLM inference by aligning cache eviction decisions with GPU attention kernel performance, significan…
Yu Cui, Ruiqing Yue, Hang Fu, Sicheng Pan +5 more
The paper introduces extsc{Spore}, a novel, training-free, and highly efficient privacy extraction attack that targets sensitive information stored in the memory of LLM agents during inference, outpe…
Yunze Zhao, Yibo Zhao, Yuchen Zhang, Zaoxing Liu +1 more
The paper introduces GRIEF, a greybox fuzzer that discovers critical, concurrency-related vulnerabilities in LLM serving systems by treating timed multi-request traces as inputs, finding issues like c…
Lucas Fenaux, Larris Xie, Aditya Bang, Alex Zhang +2 more
The paper proposes a Public/Private Hybrid Head-VFL (PPHH-VFL) architecture that significantly accelerates secure time-series inference by splitting the model head into efficient public and secure pri…
Karima Makhlouf, Lamiaa Basyoni, Syed Khaderi, Gabriel Marquez +3 more
This paper conducts a structured ablation study using a unified threat model to evaluate how various system factors (like model architecture and retrieval configuration) influence different types of p…
The paper identifies a universal, statistically predictable distribution (Mandelbrot) governing LLM outputs, enabling a highly efficient, model-agnostic scoring primitive for provenance and quality as…
The paper introduces a Contextual Integrity (CI) framework and a new benchmark (DelegateCI-Bench) to rewrite user queries sent to cloud LLMs, ensuring only task-essential information is retained while…
This paper introduces Back-Reveal, an attack demonstrating that backdoored LLM agents can systematically exfiltrate sensitive user data by embedding semantic triggers into tool-use mechanisms.
Yining Chen, Jihao Zhao, Bo Tang, Haofen Wang +4 more
MemPrivacy introduces a novel framework that protects sensitive user data in edge-cloud memory systems by replacing private spans with semantically structured placeholders, thereby minimizing data exp…