~ similar to 2605.07160v1· 20 results
This paper develops optimized algorithms and a pipeline architecture for high-throughput, memory-efficient batch processing of encrypted neural network inference, significantly improving performance o…
Onyx proposes a novel, cost-efficient disk-oblivious Approximate Nearest Neighbor (ANN) search system that significantly reduces both cost and latency compared to state-of-the-art methods.
This paper provides a comparative analysis and benchmarking of Secure Multi-Party Computation (SMPC) and Fully Homomorphic Encryption (FHE) for machine learning, finding that the optimal choice depend…
This paper introduces a dual-layer side-channel attack framework that exploits the variable workload introduced by dynamic image preprocessing in local Vision-Language Models (VLMs) to infer sensitive…
Cloak is an oblivious storage system that significantly improves the performance of ORAM by exploiting temporal locality, achieving low overheads while maintaining security.
This paper presents a novel data-free Membership Inference Attack (MIA) that uses gradient inversion on Standard Cell Library Layouts (SCLLs) to reconstruct sensitive hardware images from intercepted…
Guanlong Wu, Zhaohan li, Yao Zhang, Zheng Zhang +3 more
CachePrune introduces a privacy-aware, fine-grained KV cache sharing mechanism that allows LLM inference systems to safely reuse cache entries across users' requests, significantly improving efficienc…
Tessera introduces a novel hardware architecture that achieves secure, near-line-rate weight streaming for DNNs on UMA edge accelerators by performing cache-line granularity decryption during DRAM fet…
CHRONOS is a hardware-assisted framework that significantly reduces the latency of secure federated learning by decoupling cryptographic key setup from the active training phase, while maintaining hig…
Harshita Gupta, Mayank Kabra, Jaewoo Park, Priyam Mehta +8 more
The paper characterizes Homomorphic Encryption (HE) operations on a real-world Processing-In-Memory (PIM) system, demonstrating that while PIM is a viable alternative to CPUs/GPUs, performance is limi…
This paper provides a comprehensive, system-level comparison of MPC and FHE for Privacy-Preserving Machine Learning (PPML) across various models and environments, moving beyond single-metric latency a…
The paper proves that platform-deterministic inference is a necessary and sufficient condition for trustworthy AI, establishing that AI trust fundamentally relies on consistent arithmetic.
The paper presents a highly optimized, low-stack implementation of the HAETAE signature scheme, reducing peak stack usage significantly to enable its use on severely memory-constrained microcontroller…
Aiman Al Masoud, Antony Anju, Marco Arazzi, Mert Cihangiroglu +5 more
This paper provides the first comprehensive Systematization of Knowledge (SoK) on the security aspects of LLM-as-a-Judge (LaaJ) systems, identifying key vulnerabilities and proposing a taxonomy for fu…
Darya Kaviani, Alp Eren Ozdarendeli, Jinhao Zhu, Yu Ding +1 more
Opal is a private memory system for personal AI that maintains high retrieval accuracy and throughput while ensuring data privacy by confining all data-dependent reasoning to a trusted hardware enclav…
The paper systematically evaluates various defense mechanisms against persistent memory attacks on LLM agents, finding that only tool-gating at the memory layer (Memory Sandbox) effectively mitigates…
The paper introduces PoSME, a cryptographic primitive that enforces strict sequential memory execution by chaining data-dependent writes, providing verifiable delay and authorship attestation.
Guoci Chen, Xiurui Pan, Qiao Li, Bo Mao +4 more
The paper introduces TIGER, a GPU-accelerated framework that significantly speeds up high-precision evaluation of nonlinear layers for encrypted LLM inference using TFHE.
The paper introduces SMA-DP-SGD, a Spectral Memory-Aware Differential Privacy method that enhances standard DP-SGD by incorporating a memory branch derived from past noisy updates, improving model uti…
The paper introduces ActInv and PAF to systematically analyze and quantify privacy leakage from intermediate activations during split inference of LLMs, proposing PriPert for enhanced defense.