Papers similar to 2605.28208

Similar to 2605.28208 · 18 results

cs.CR · cs.AI · cs.LG · Recent · May 21, 2026

Characterizing the Fault Response of the Intel Neural Compute Stick 2 Under Single-Pulse Electromagnetic Fault Injection

Štefan Kučerák, Jakub Breier, Xiaolu Hou

The paper systematically characterizes the fault response of the Intel NCS2 accelerator to electromagnetic fault injection, revealing a major degradation mode that is undetectable by standard inferenc…

cs.CR · cs.AI · cs.DC · Recent · Apr 3, 2026

AEGIS: Scaling Long-Sequence Homomorphic Encrypted Transformer Inference via Hybrid Parallelism on Multi-GPU Systems

Zhaoting Gong, Ran Ran, Fan Yao, Wujie Wen

AEGIS is a novel system that significantly improves the scalability of running large, long-sequence Transformer models under Fully Homomorphic Encryption (FHE) on multi-GPU systems by optimizing data…

cs.AR · Recent · Jun 1, 2026

CHIMERA: A Flexible and Scalable 3.1 TOPS/W AI-MCU with Transformer Accelerator and 563 Gb/s Shared-L2 Memory Subsystem with QoS Guarantees

Lorenzo Leone, Philip Wiese, Gamze İslamoğlu, Michael Rogenmoser +3 more

The paper introduces Chimera, a highly efficient and scalable MCU designed for ultra-low-power edge AI inference, achieving 3.1 TOPS/W by integrating a dedicated transformer accelerator and a QoS-guar…

cs.CV · cs.CR · Recent · Apr 6, 2026

Lightweight True In-Pixel Encryption with FeFET Enabled Pixel Design for Secure Imaging

Md Rahatul Islam Udoy, Diego Ferrer, Wantong Li, Kai Ni +2 more

The paper proposes SecurePix, a compact CMOS-compatible pixel architecture that achieves true in-pixel encryption using FeFETs, demonstrating strong image security and low power overhead.

cs.AR · cs.PF · Recent · May 30, 2026

Regular-Activation Concentration: Characterizing Column-Level Output Sparsity Across Diffusion Model Architectures

Dazhi Yang, Shafayat Mowla Anik, Byeong Kil Lee, Jeeho Ryoo

The paper systematically characterizes column-level activation sparsity across various diffusion model architectures, demonstrating that element-level sparsity metrics significantly overestimate the a…

cs.CR · cs.AR · cs.PF · Recent · Jun 1, 2026

Implementation and Optimization of HQC Decoding on NPU-Integrated Devices

Vu Minh Chau, Nguyen Ngoc Kiet, Pham Quang Minh, Mai Xuan Ngoc +2 more

This paper optimizes the decoding of Hamming Quasi-Cyclic (HQC) codes for post-quantum cryptography on NPU-integrated mobile devices by redesigning the core kernels to leverage the Hexagon Vector eXte…

cs.CR · cs.AR · cs.PF · Recent · Jun 1, 2026

Implementation and Optimization of HQC Decoding on NPU-Integrated Devices

Vu Minh Chau, Nguyen Ngoc Kiet, Pham Quang Minh, Mai Xuan Ngoc +2 more

This paper optimizes the decoding of Hamming Quasi-Cyclic (HQC) codes for post-quantum cryptography on NPU-integrated mobile devices by redesigning the kernels to leverage the Hexagon Vector eXtension…

cs.AR · cs.MS · Recent · Jun 3, 2026

GoldenFloat: A Phi-Derived Static-Split Floating-Point Family from GF4 to GF256 with a Lucas-Exact Integer Identity

Dmitrii Vasiliev

This paper presents a hardware-oriented description of GoldenFloat, a static-split floating-point family, and its concrete artefacts.

cs.CR · cs.LG · Recent · Mar 25, 2026

Efficient Encrypted Computation in Convolutional Spiking Neural Networks with TFHE

Longfei Guo, Pengbo Li, Ting Gao, Yonghai Zhong +2 more

The paper introduces FHE-DiCSNN, a novel framework that uses the TFHE scheme to enable secure and efficient computation on Spiking Neural Networks (SNNs), achieving high accuracy and fast inference ti…

cs.CR · Recent · Apr 17, 2026

Low-Stack HAETAE for Memory-Constrained Microcontrollers

Gustavo Banegas, Kim Youngbeom, Seo Seog Chung, Vredendaal Christine Van

The paper presents a highly optimized, low-stack implementation of the HAETAE signature scheme, reducing peak stack usage significantly to enable its use on severely memory-constrained microcontroller…

cs.AR · Recent · Jun 1, 2026

O-POPE: High-Frequency Pipelined Outer Product based GEMM acceleration with minimal buffering overhead

Danilo Cammarata, Angelo Garofalo, Luca Benini

O-POPE is a novel outer-product engine that accelerates floating-point GEMM by repurposing FPU pipeline registers as buffers, achieving high utilization and improved energy efficiency.

cs.AR · Recent · May 29, 2026

A Reconfigurable Computing In-Memory Macro with Charge-sharing-based Weighted Accumulator

Junyi Yang, Shuai Dong, Zhengnan Fu, Hongyang Shang +1 more

The paper proposes a highly reconfigurable 256x128 in-memory computing array that significantly improves efficiency and performance for analog computing by introducing novel components for ADC, weight…

cs.LG · cs.AI · cs.DC · Recent · May 27, 2026

How Far Can Disaggregation Go? A Design-Space Exploration of Attention-FFN Disaggregation for Efficient MoE LLM Serving

Hanjiang Wu, Abhimanyu Rajeshkumar Bambhaniya, Sarbartha Banerjee, Tuhin Khare +8 more

The paper systematically analyzes the benefits and limits of Attention-FFN Disaggregation (AFD) for Mixture-of-Experts (MoE) LLM serving, demonstrating that AFD is crucial for achieving high throughpu…

cs.CR · Recent · Apr 16, 2026

Structural Dependency Analysis for Masked NTT Hardware: Scalable Pre-Silicon Verification of Post-Quantum Cryptographic Accelerators

Ray Iskander, Khaled Kirah

The paper introduces a four-stage structural dependency analysis hierarchy that enables scalable, sound first-order masking verification for large, production-level post-quantum cryptographic accelera…

cs.CR · cs.AR · Recent · Apr 22, 2026

PVAC: A RowHammer Mitigation Architecture Exploiting Per-victim-row Counting

Jumin Kim, Seungmin Baek, Hwayong Nam, Minbok Wi +2 more

The paper introduces PVAC, a novel victim-based row counting mechanism that accurately tracks RowHammer attacks by incrementing counters on the victim row, thereby improving hammering tolerance and pe…

cs.CV · cs.AI · Recent · May 28, 2026

Pocket-Dentist: On-Device Dental Image Understanding via Efficient Multimodal Large Language Models

Kai Bian, Xucheng Guo, Bin Chen, Lingyan Ruan +3 more

The paper introduces Pocket-Dentist, an efficiency-aware benchmark and model, and demonstrates that compact Vision-Language Models (VLMs) can outperform larger models in accuracy while drasti…

cs.CR · Recent · May 6, 2026

A Pragmatic Comparison of Cryptographic Computation Technologies for Machine Learning

Marcus Taubert, Adam Skuta, Thomas Loruenser

This paper provides a comparative analysis and benchmarking of Secure Multi-Party Computation (SMPC) and Fully Homomorphic Encryption (FHE) for machine learning, finding that the optimal choice depend…

cs.CR · cs.AI · cs.CV · Recent · Apr 6, 2026

SE-Enhanced ViT and BiLSTM-Based Intrusion Detection for Secure IIoT and IoMT Environments

Afrah Gueriani, Hamza Kheddar, Ahmed Cherif Mazari, Seref Sagiroglu +1 more

The paper proposes an SE ViT-BiLSTM hybrid model for enhanced intrusion detection in IIoT and IoMT environments, achieving superior performance on real-world datasets, especially after data balancing.
