~ similar to 2603.28972v1· 20 results
The paper introduces a Contextual Integrity (CI) framework and a new benchmark (DelegateCI-Bench) to rewrite user queries sent to cloud LLMs, ensuring only task-essential information is retained while…
The paper introduces AgentSecBench, a security evaluation framework that measures prompt injection, privacy leakage, and tool-use integrity in LLM agents by defining formal security games and testing…
The paper introduces ActInv and PAF to systematically analyze and quantify privacy leakage from intermediate activations during split inference of LLMs, proposing PriPert for enhanced defense.
The paper introduces an automated framework demonstrating that LLM system instructions are vulnerable to encoding attacks, where structured output requests can bypass safety refusals and leak sensitiv…
The paper systematically evaluates eight privacy-preserving techniques for LLM requests, finding that a combination of local inference, redaction, and semantic rephrasing provides the best overall pro…
BodhiPromptShield is a policy-aware framework that mediates prompt privacy by detecting sensitive data and replacing it with secure placeholders across multiple stages (retrieval, memory, tools) to pr…
The paper introduces $(l, b)$-inextractability, a new formal measure that demonstrates that standard indistinguishability properties are insufficient for guaranteeing protection against data extractio…
Xiaodong Li, Yuhua Wang, Qingchen Yu, Zixuan Qin +4 more
The paper proposes DAMPER, a domain-aware framework that autonomously extracts and rewrites private information from text while providing rigorous differential privacy guarantees, significantly improv…
The paper introduces Prompt Control-Flow Integrity (PCFI), a priority-aware runtime defense that models LLM prompts as structured segments to intercept prompt injection attacks with high accuracy and…
Erchi Wang, Pengrun Huang, Eli Chien, Om Thakkar +3 more
The paper introduces DPrivBench, a new benchmark to test whether large language models (LLMs) can automate the complex reasoning required to verify differential privacy guarantees for algorithms.
Bo Lv, Zhiheng Xu, KeDong Xiu, Ruyi Ding +3 more
RouteScan introduces a non-intrusive framework that audits the safety of Mixture-of-Experts (MoE) LLMs by analyzing low-level GPU expert routing telemetry, achieving high accuracy even on unseen harmf…
This paper introduces a fingerprinting method that exploits subtle numerical deviations in the inference system components (like the engine or hardware) to reliably identify the specific components us…
Pengcheng Sun, Lan Zhang, Zhaopeng Zhang, Jiewei Lai +1 more
Permit is a novel framework that enforces fine-grained, permission-aware control over the hidden states of LLMs, preventing information leakage even when sensitive data is present in the context.
Karima Makhlouf, Lamiaa Basyoni, Syed Khaderi, Gabriel Marquez +3 more
This paper conducts a structured ablation study using a unified threat model to evaluate how various system factors (like model architecture and retrieval configuration) influence different types of p…
Guanlong Wu, Zhaohan li, Yao Zhang, Zheng Zhang +3 more
CachePrune introduces a privacy-aware, fine-grained KV cache sharing mechanism that allows LLM inference systems to safely reuse cache entries across users' requests, significantly improving efficienc…
The paper introduces GuardPhish, a large-scale dataset and evaluation framework, demonstrating that even high-performing open-source LLMs can generate actionable phishing content despite accurate inte…
The paper proposes an architectural proxy (MCP) to enforce robust, reliable tool access control for LLM agents, demonstrating that this structural enforcement is necessary because prompt-based restric…
Wenjie Fu, Xiaoting Qin, Jue Zhang, Qingwei Lin +4 more
The paper introduces CI-Work, a benchmark demonstrating that current enterprise LLM agents frequently leak sensitive information while performing tasks, suggesting that privacy protection requires arc…
Weijun Li, Arnaud Grivet Sébert, Qiongkai Xu, Annabelle McIver +1 more
The paper proposes an empirical calibration method, TeDA, to provide a more comparable and interpretable assessment of privacy loss for text rewriting mechanisms under Local Differential Privacy (LDP)…
Zonghao Ying, Haozheng Wang, Jiangfan Liu, Quanchen Zou +4 more
AgentVisor is a novel defense framework that uses semantic virtualization, inspired by OS principles, to significantly reduce LLM agent vulnerability to prompt injection while maintaining high utility…