~ similar to 2605.02255v1· 20 results
This paper introduces a fingerprinting method that exploits subtle numerical deviations in the inference system components (like the engine or hardware) to reliably identify the specific components us…
Rui Wen, Mark Russinovich, Andrew Paverd, Jun Sakuma +1 more
The paper introduces MetaBackdoor, a novel class of LLM backdoor attacks that exploits positional encoding (length-based triggers) rather than requiring modifications to the textual content.
The paper introduces an automated framework demonstrating that LLM system instructions are vulnerable to encoding attacks, where structured output requests can bypass safety refusals and leak sensitiv…
This paper introduces Back-Reveal, an attack demonstrating that backdoored LLM agents can systematically exfiltrate sensitive user data by embedding semantic triggers into tool-use mechanisms.
Xingyu Lyu, Jianfeng He, Ning Wang, Yidan Hu +4 more
The paper proposes ADAM, a novel and highly effective privacy attack that systematically extracts sensitive data from LLM agent memory by adaptively querying the victim agent's memory based on data di…
Yu Cui, Ruiqing Yue, Hang Fu, Sicheng Pan +5 more
The paper introduces extsc{Spore}, a novel, training-free, and highly efficient privacy extraction attack that targets sensitive information stored in the memory of LLM agents during inference, outpe…
The paper introduces AutoMIA, a novel framework that uses LLM agents to automate the discovery and implementation of Membership Inference Attacks (MIAs), achieving state-of-the-art performance by syst…
This paper demonstrates that LLM cascade systems, designed for efficiency, are vulnerable to targeted adversarial attacks that simultaneously degrade both performance and cost-efficiency.
Zihan Liu, Yizhen Wang, Rui Wang, Xiu Tang +1 more
This survey provides a comprehensive, structured taxonomy of split learning techniques for fine-tuning Large Language Models (LLMs), covering model optimization, system efficiency, and privacy preserv…
The paper establishes a standardized security assessment framework and develops a multi-layered defensive system, demonstrating that systematic testing and external defenses are crucial for safe LLM d…
This paper provides a systematic, lifecycle-based framework for analyzing security threats and defenses across the entire fine-tuning process of LLMs, revealing that attack effectiveness is highly mod…
Jiahao Chen, Qi Zhang, Ruixiao Lin, Chunyi Zhou +6 more
The paper introduces the PrivacyIceberg framework to systematically categorize and empirically demonstrate the high risk of automated, deep personal profiling using LLM agents, revealing a significant…
The paper introduces ActInv and PAF to systematically analyze and quantify privacy leakage from intermediate activations during split inference of LLMs, proposing PriPert for enhanced defense.
Jiali Wei, Ming Fan, Guoheng Sun, Xicheng Zhang +2 more
The paper introduces BadStyle, a novel backdoor attack framework that generates natural, stealthy poisoned samples using LLMs to compromise various LLMs with high success rates and robust activation.
Zeyuan Chen, Yihan Ma, Xinyue Shen, Michael Backes +1 more
The PopQuiz Attack is a novel black-box membership inference attack that successfully tests whether large language models memorize specific training data by framing the target data as multiple-choice…
The paper introduces $(l, b)$-inextractability, a new formal measure that demonstrates that standard indistinguishability properties are insufficient for guaranteeing protection against data extractio…
The paper introduces PrivacySIM, an evaluation suite that benchmarks how well LLMs can simulate individual user privacy decisions based on persona attributes, finding that while conditioning improves…
The paper introduces CYBERMASKQA, a novel privacy-aware benchmark designed to evaluate Large Language Models' ability to perform accurate cybersecurity question answering while simultaneously preservi…
The paper proposes a new evaluation framework showing that, under realistic conditions, Membership Inference Attacks (MIAs) are weak privacy threats, suggesting that relying on them as a primary priva…
This paper provides the first comprehensive review of threats and defenses specifically targeting on-device AI inference, revealing a significant imbalance where certain attack types, like adversarial…