~ similar to 2604.04696v2· 20 results
Ofir Dvir, Kali Hale, Javin Zipkin, Divyakant Agrawal +1 more
The paper introduces SPIDER, a novel single-server Private Information Retrieval (PIR) scheme that achieves state-of-the-art communication complexity without requiring specialized server cooperation o…
The paper introduces local private information retrieval (local PIR), redefining user privacy in graph-replicated systems to focus on hiding the message index from servers, and demonstrates that local…
The paper proposes a novel, unconditionally secure information-theoretic Authenticated Private Information Retrieval (itAPIR) scheme that upgrades existing, less secure itPIR-RV schemes without overhe…
Harshita Gupta, Mayank Kabra, Jaewoo Park, Priyam Mehta +8 more
The paper characterizes Homomorphic Encryption (HE) operations on a real-world Processing-In-Memory (PIM) system, demonstrating that while PIM is a viable alternative to CPUs/GPUs, performance is limi…
The paper proposes a novel ring-based information-theoretic Private Information Retrieval (itED-PIR) scheme that overcomes the key size and communication overhead limitations of existing field-based A…
The paper characterizes 'dead-entry' TLB misses in GPUs, which occur when recently evicted translations are immediately re-walked, and proposes DEPOT, a Bloom filter mechanism that significantly reduc…
This paper generalizes the definition of privacy in graph-replicated Private Information Retrieval (PIR) by allowing each server to have an arbitrary, specific set of message indices it must keep priv…
This paper presents a cryptanalytic attack demonstrating that a specific code-based Private Information Retrieval (PIR) scheme can be broken, allowing the server to efficiently determine the requested…
This paper empirically evaluates the performance of the Polars DataFrame engine running within Intel SGX2 enclaves, finding that while the overall security overhead is manageable, the performance is s…
Guanlong Wu, Zhaohan li, Yao Zhang, Zheng Zhang +3 more
CachePrune introduces a privacy-aware, fine-grained KV cache sharing mechanism that allows LLM inference systems to safely reuse cache entries across users' requests, significantly improving efficienc…
Chris S. Lin, Yuqin Yan, Guozhen Ding, Joyce Qu +3 more
This paper demonstrates a novel GPU-side privilege escalation attack, showing that Rowhammer can be used to target and tamper with page tables to gain unauthorized access to co-tenant memory and ultim…
This paper investigates the potential of real-world Processing-in-Memory (PIM) architectures, specifically using UPMEM, to accelerate cryptographic algorithms, demonstrating that distributing computat…
Guoci Chen, Xiurui Pan, Qiao Li, Bo Mao +4 more
The paper introduces TIGER, a GPU-accelerated framework that significantly speeds up high-precision evaluation of nonlinear layers for encrypted LLM inference using TFHE.
Zhijun Li, Minghui Xu, Huayi Qi, Wenxuan Yu +5 more
PRAG is an end-to-end privacy-preserving Retrieval-Augmented Generation (RAG) system that maintains high retrieval accuracy and scalability in cloud environments by encrypting both documents and queri…
Tessera introduces a novel hardware architecture that achieves secure, near-line-rate weight streaming for DNNs on UMA edge accelerators by performing cache-line granularity decryption during DRAM fet…
Gangmuk Lim, Wanyu Zhao, Brighten Godfrey, Jiaxin Shan +2 more
Lodestar is a novel online learning-based request routing system that significantly improves LLM inference efficiency by dynamically assigning incoming requests to the optimal GPU instance to minimize…
ACRONYM is a novel algorithm-hardware co-designed platform that enables high-recall, continuous approximate nearest neighbor search in memory for dynamic vector databases, achieving massive throughput…
The paper proposes PrISM, an intersection-based probabilistic mitigation technique that significantly improves the scalability of RowHammer defense at low thresholds by correlating sampled row history…
The paper introduces Rotary GPU, an exploratory execution approach demonstrating that large Mixture-of-Experts models can be run locally on consumer GPUs with limited VRAM, achieving usable decode thr…
Darya Kaviani, Alp Eren Ozdarendeli, Jinhao Zhu, Yu Ding +1 more
Opal is a private memory system for personal AI that maintains high retrieval accuracy and throughput while ensuring data privacy by confining all data-dependent reasoning to a trusted hardware enclav…