Lin Chen
13 indexed papers
Research Timeline
The paper introduces WebAgentGuard, a reasoning-driven, multimodal guard model that detects prompt injection attacks against web agents without compromising their functionality.
The paper proposes WARD, a robust and efficient defense model that secures web agents against prompt injection attacks embedded in web content, achieving high recall and low false positives even against adaptive attacks.
The paper introduces AsyncTool, a new benchmark designed to evaluate LLM agents' ability to handle multiple, concurrent tasks with delayed tool feedback, demonstrating that asynchronous coordination is a significant challenge for current models.
Qwen-VLA introduces a unified embodied foundation model that extends vision-language understanding to continuous action generation, enabling robust, multi-task generalization across diverse robotic tasks and embodiments.
The paper proposes formulating RAG design as an architecture search problem and introduces RAISE, a comprehensive framework and benchmark for systematically optimizing RAG hyperparameters.
AliMark proposes a novel watermarking framework that treats sentence-level watermarking as a bit sequence alignment problem, significantly enhancing robustness against structural text perturbations like sentence splitting and merging.
EvoMD-LLM introduces a novel framework that models reactive molecular dynamics as a symbolic temporal language problem, enabling LLMs to accurately predict complex, time-evolving chemical processes.
LoopFM proposes a novel framework to significantly improve knowledge distillation for recommendation systems by structuring the rich intermediate embeddings of large foundation models as input features, thereby overcoming the limitations of single-scalar prediction transfer.
AliMark proposes a novel framework that enhances the robustness of sentence-level watermarking by reformulating the problem as a bit sequence encoding and alignment task, significantly improving resilience against structural text perturbations like sentence splitting and merging.
The paper proposes Predictive Routing Replay (PR2), which stabilizes reinforcement learning on Mixture-of-Experts (MoE) LLMs by predicting short-horizon router evolution and incorporating it during training and rollout.
The paper proposes Preference Delta Aggregation (PDA), a framework that aggregates multiple weak preference signals derived from smaller model pairs using LoRA merging to significantly boost the performance of a strong large language model.
AnchorSteer introduces a framework that achieves high-fidelity, structure-preserving music editing by decoupling semantic concept injection from structural constraints.
The paper proposes AsymCache, a computation-latency-aware KV cache management system that optimizes LLM inference by aligning cache eviction decisions with GPU attention kernel performance, significantly reducing both Time-to-First-Token (TTFT) and Time-Per-Output-Token (TPOT).
Papers
Multi-Segment Attention: Enabling Efficient KV-Cache Management for Faster Large Language Model Serving
Chunan Shi, Yilei Chen, Yilin Chen, Xupeng Miao +1 more
The paper proposes AsymCache, a computation-latency-aware KV cache management system that optimizes LLM inference by aligning cache eviction decisions with GPU attention kernel performance, significan…