~ similar to 2604.15499v1· 20 results
Xidong Wu, Yukuan Zhang, Yuqiong Ji, Reza Shirkavand +2 more
The paper proposes PPRoute, a privacy-preserving LLM routing framework that significantly speeds up secure model selection while maintaining high performance comparable to non-private methods.
EncFormer is a novel two-party framework that significantly improves the efficiency and scalability of private Transformer inference by optimizing the combination of Fully Homomorphic Encryption (FHE)…
The paper introduces a lightweight, sampling-based cryptographic protocol for verifiable AI inference that drastically reduces proving overhead from minutes to milliseconds by leveraging statistical p…
This paper provides a comprehensive, system-level comparison of MPC and FHE for Privacy-Preserving Machine Learning (PPML) across various models and environments, moving beyond single-metric latency a…
Lucas Fenaux, Larris Xie, Aditya Bang, Alex Zhang +2 more
The paper proposes a Public/Private Hybrid Head-VFL (PPHH-VFL) architecture that significantly accelerates secure time-series inference by splitting the model head into efficient public and secure pri…
Zekun Fei, Zihao Wang, Weijie Liu, Ruiqi He +3 more
Misrouter introduces an input-only adversarial framework to exploit the routing mechanisms of Mixture-of-Experts (MoE) LLMs, enabling unsafe behavior induction against remotely hosted, black-box servi…
This paper provides the first comprehensive review of threats and defenses specifically targeting on-device AI inference, revealing a significant imbalance where certain attack types, like adversarial…
The paper proposes a co-design paradigm, 'Meeting in the Middle,' to make Fully Homomorphic Encryption (FHE) practical for AI inference by optimizing both the cryptographic schemes and the underlying…
Haochun Tang, Yuliang Yan, Jiahua Lu, Huaxiao Liu +1 more
The paper introduces R$^2$A, an adversarial attack that uses suffix optimization to mislead black-box LLM routers into consistently selecting expensive, high-capability models.
The paper introduces ActInv and PAF to systematically analyze and quantify privacy leakage from intermediate activations during split inference of LLMs, proposing PriPert for enhanced defense.
This paper provides a comparative analysis and benchmarking of Secure Multi-Party Computation (SMPC) and Fully Homomorphic Encryption (FHE) for machine learning, finding that the optimal choice depend…
Bo Lv, Zhiheng Xu, KeDong Xiu, Ruyi Ding +3 more
RouteScan introduces a non-intrusive framework that audits the safety of Mixture-of-Experts (MoE) LLMs by analyzing low-level GPU expert routing telemetry, achieving high accuracy even on unseen harmf…
Zhengyi Li, Yakai Wang, Kang Yang, Yu Yu +5 more
This paper demonstrates a novel attack against the shuffling defense used in secure Transformer inference, showing that randomly permuted activations can still be exploited to recover model weights.
NANOZK introduces a novel, highly efficient zero-knowledge proof system that allows users to cryptographically verify that the output of a large language model (LLM) was generated by a specific, claim…
The paper introduces 'Routing Hijacking,' a severe attack where malicious clients forge semantic profiles in Federated RAG systems to misroute target queries, and proposes a trust-aware post-routing f…
This paper develops optimized algorithms and a pipeline architecture for high-throughput, memory-efficient batch processing of encrypted neural network inference, significantly improving performance o…
Tessera introduces a novel hardware architecture that achieves secure, near-line-rate weight streaming for DNNs on UMA edge accelerators by performing cache-line granularity decryption during DRAM fet…
AEGIS is a novel system that significantly improves the scalability of running large, long-sequence Transformer models under Fully Homomorphic Encryption (FHE) on multi-GPU systems by optimizing data…
Erik Bångsbo, Zakaria Hersi, Anna Benktson, Stefan Holmgren +1 more
This paper proposes and demonstrates a method to secure high-performance RDMA data transfers by implementing AES-128 encryption directly within a programmable network switch, maintaining high throughp…
The paper proposes an optimized, end-to-end privacy-preserving framework for vertical federated learning by distributing aggregation roles across multiple servers using secure multiparty computation a…