Lei Chen
8 indexed papers
Research Timeline
The paper introduces CREBench, a comprehensive benchmark for evaluating Large Language Models (LLMs) on cryptographic binary reverse engineering, finding that while LLMs show promise, human experts still maintain a significant advantage.
This paper proposes a comprehensive taxonomy (SLOT) to systematically categorize security risks, attacks, and defenses specific to Retrieval-Augmented Generation (RAG), clarifying that these risks are distinct from inherent LLM flaws.
The paper proposes a quantum-resistant quantum teleportation (QRQT) framework using post-quantum cryptography to secure the classical channel, establishing maximum secure communication distances and analyzing the impact of various classical bit leakage models.
This paper introduces personalized mechanisms for estimating streaming statistics under $w$-event personalized differential privacy, significantly improving accuracy compared to existing methods.
The paper reframes context distillation as a latent memory management problem, proposing a modular framework using LoRA adapters and a Self-Gating mechanism for efficient, selective memory retrieval and activation.
The paper introduces DistractionIF, a benchmark showing that larger LLMs are paradoxically less robust to benign, instruction-like noise in reference text, and suggests that reinforcement learning can restore this robustness.
The paper introduces TravelEval, a comprehensive, six-dimensional benchmarking framework that evaluates LLM-powered travel plans using realistic spatio-temporal simulation, revealing that current LLMs struggle with globally-optimized, multi-dimensional planning.
The paper proposes AsymCache, a computation-latency-aware KV cache management system that optimizes LLM inference by aligning cache eviction decisions with GPU attention kernel performance, significantly reducing both Time-to-First-Token (TTFT) and Time-Per-Output-Token (TPOT).
Papers
Multi-Segment Attention: Enabling Efficient KV-Cache Management for Faster Large Language Model Serving
Chunan Shi, Yilei Chen, Yilin Chen, Xupeng Miao +1 more
The paper proposes AsymCache, a computation-latency-aware KV cache management system that optimizes LLM inference by aligning cache eviction decisions with GPU attention kernel performance, significan…