~ similar to 2605.28208 · 18 results
The paper systematically characterizes the fault response of the Intel NCS2 accelerator to electromagnetic fault injection, revealing a major degradation mode that is undetectable by standard inferenc…
AEGIS is a novel system that significantly improves the scalability of running large, long-sequence Transformer models under Fully Homomorphic Encryption (FHE) on multi-GPU systems by optimizing data…
The paper introduces Chimera, a highly efficient and scalable MCU designed for ultra-low-power edge AI inference, achieving 3.1 TOPS/W by integrating a dedicated transformer accelerator and a QoS-guar…
Md Rahatul Islam Udoy, Diego Ferrer, Wantong Li, Kai Ni +2 more
The paper proposes SecurePix, a compact CMOS-compatible pixel architecture that achieves true in-pixel encryption using FeFETs, demonstrating strong image security and low power overhead.
The paper systematically characterizes column-level activation sparsity across various diffusion model architectures, demonstrating that element-level sparsity metrics significantly overestimate the a…
Vu Minh Chau, Nguyen Ngoc Kiet, Pham Quang Minh, Mai Xuan Ngoc +2 more
This paper optimizes the decoding of Hamming Quasi-Cyclic (HQC) codes for post-quantum cryptography on NPU-integrated mobile devices by redesigning the core kernels to leverage the Hexagon Vector eXte…
Vu Minh Chau, Nguyen Ngoc Kiet, Pham Quang Minh, Mai Xuan Ngoc +2 more
This paper optimizes the decoding of Hamming Quasi-Cyclic (HQC) codes for post-quantum cryptography on NPU-integrated mobile devices by redesigning the kernels to leverage the Hexagon Vector eXtension…
This paper presents a hardware-oriented description of GoldenFloat, a static-split floating-point family, and its concrete artefacts.
Longfei Guo, Pengbo Li, Ting Gao, Yonghai Zhong +2 more
The paper introduces FHE-DiCSNN, a novel framework that uses the TFHE scheme to enable secure and efficient computation on Spiking Neural Networks (SNNs), achieving high accuracy and fast inference ti…
The paper presents a highly optimized, low-stack implementation of the HAETAE signature scheme that significantly reduces peak stack usage, enabling its use on severely memory-constrained microcontroller…
O-POPE is a novel outer-product engine that accelerates floating-point GEMM by repurposing FPU pipeline registers as buffers, achieving high utilization and improved energy efficiency.
Junyi Yang, Shuai Dong, Zhengnan Fu, Hongyang Shang +1 more
The paper proposes a highly reconfigurable 256x128 in-memory computing array that significantly improves efficiency and performance for analog computing by introducing novel components for ADC, weight…
The paper systematically analyzes the benefits and limits of Attention-FFN Disaggregation (AFD) for Mixture-of-Experts (MoE) LLM serving, demonstrating that AFD is crucial for achieving high throughpu…
The paper introduces a four-stage structural dependency analysis hierarchy that enables scalable, sound first-order masking verification for large, production-level post-quantum cryptographic accelera…
Jumin Kim, Seungmin Baek, Hwayong Nam, Minbok Wi +2 more
The paper introduces PVAC, a novel victim-based row counting mechanism that accurately tracks RowHammer attacks by incrementing counters on the victim row, thereby improving hammering tolerance and pe…
Kai Bian, Xucheng Guo, Bin Chen, Lingyan Ruan +3 more
The paper introduces Pocket-Dentist, an efficiency-aware benchmark and model, and shows that compact Vision-Language Models (VLMs) can outperform larger models in accuracy while drasti…
This paper provides a comparative analysis and benchmarking of Secure Multi-Party Computation (SMPC) and Fully Homomorphic Encryption (FHE) for machine learning, finding that the optimal choice depend…
The paper proposes an SE ViT-BiLSTM hybrid model for enhanced intrusion detection in IIoT and IoMT environments, achieving superior performance on real-world datasets, especially after data balancing.