~ similar to 2605.04901v1· 20 results
EncFormer is a novel two-party framework that significantly improves the efficiency and scalability of private Transformer inference by optimizing the combination of Fully Homomorphic Encryption (FHE)…
The paper introduces a lightweight, sampling-based cryptographic protocol for verifiable AI inference that drastically reduces proving overhead from minutes to milliseconds by leveraging statistical p…
The paper introduces Rotated Robustness (RoR), a training-free defense that uses orthogonal transformations to prevent catastrophic model collapse in LLMs caused by hardware bit-flip attacks.
The paper introduces ActInv and PAF to systematically analyze and quantify privacy leakage from intermediate activations during split inference of LLMs, proposing PriPert for enhanced defense.
The paper proposes the first general defense framework to make all union-preserving Differential Privacy (DP) protocols, specifically those based on shuffle-DP, resilient against poisoning attacks.
The paper analyzes the security of a partially masked hardware accelerator for Number Theoretic Transform (NTT) in PQC, demonstrating that the claimed security margins are significantly overestimated…
Ziyang You, Xiaoke Yang, Zhanling Fan, Feng Guo +2 more
The paper introduces SeedHijack, a backdoor attack that manipulates the pseudorandom number generation process in LLMs to force specific token selections, and proposes a hardware quantum random number…
SecureRouter is an encrypted routing and inference framework that accelerates secure transformer inference by adaptively selecting the optimal model size based on the encrypted input, achieving a 1.95…
The paper proposes a method for bit-exact verification of AI inference outputs without sacrificing performance, demonstrating that deterministic, precise re-computation is possible even across differe…
This paper provides a comparative analysis and benchmarking of Secure Multi-Party Computation (SMPC) and Fully Homomorphic Encryption (FHE) for machine learning, finding that the optimal choice depend…
This paper develops optimized algorithms and a pipeline architecture for high-throughput, memory-efficient batch processing of encrypted neural network inference, significantly improving performance o…
AEGIS is a novel system that significantly improves the scalability of running large, long-sequence Transformer models under Fully Homomorphic Encryption (FHE) on multi-GPU systems by optimizing data…
The paper introduces DiffusionHijack, a supply-chain backdoor attack that compromises the PRNG used by diffusion models to deterministically control generated images, which is successfully mitigated b…
This paper demonstrates that the Euston secure inference framework, which uses SVD-based matrix transmission to save bandwidth, leaks private input data by exploiting subspace leakage of random masks.
The paper introduces ParDef, a generalized defense mechanism that effectively mitigates various types of parameter attacks on deep neural networks while maintaining high performance.
Guoci Chen, Xiurui Pan, Qiao Li, Bo Mao +4 more
The paper introduces TIGER, a GPU-accelerated framework that significantly speeds up high-precision evaluation of nonlinear layers for encrypted LLM inference using TFHE.
Lucas Fenaux, Larris Xie, Aditya Bang, Alex Zhang +2 more
The paper proposes a Public/Private Hybrid Head-VFL (PPHH-VFL) architecture that significantly accelerates secure time-series inference by splitting the model head into efficient public and secure pri…
LoREnc is a novel, training-free framework that secures Foundation Models (FMs) and LoRA adapters against intellectual property leakage and model recovery attacks by spectrally truncating weights and…
The paper introduces Asymmetric Langevin Unlearning (ALU), a novel framework that uses public data to significantly reduce the utility loss typically associated with certified machine unlearning, enab…
Voktho Das, M Zafir Sadik Khan, Jafar Vafaei, Kimia Azar +1 more
The paper proposes a hybrid ASIC+eFPGA architecture to enhance the security and resilience of edge LLM inference accelerators against both runtime and supply-chain attacks.