Qi Zhan
19 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper introduces SafeAudit, a meta-audit framework that systematically enumerates test cases and uses a quantitative metric to uncover significant residual unsafe behaviors in LLM agents that existing benchmarks miss.
This paper conducts a large-scale empirical study demonstrating that Java library exploits can accurately identify affected versions, achieving high recall and precision, and proposes strategies for exploit migration to improve coverage.
This survey provides a comprehensive, structured review of safety research in Embodied AI, analyzing attacks and defenses across the entire embodied pipeline to guide the development of safe, robust, and reliable real-world agents.
This paper proposes a comprehensive taxonomy (SLOT) to systematically categorize security risks, attacks, and defenses specific to Retrieval-Augmented Generation (RAG), clarifying that these risks are distinct from inherent LLM flaws.
AnyPoC introduces a general multi-agent framework that reliably generates and validates executable Proof-of-Concept (PoC) tests from candidate bug reports, significantly improving automated bug detection accuracy across diverse software systems.
Privatar introduces a scalable, privacy-preserving framework to offload computationally intensive multi-user avatar reconstruction from VR headsets to untrusted local devices, significantly improving user capacity while maintaining strong privacy guarantees.
The paper introduces the PrivacyIceberg framework to systematically categorize and empirically demonstrate the high risk of automated, deep personal profiling using LLM agents, revealing a significant gap between public concern and platform safeguards.
PragLocker is a novel prompt protection scheme that secures valuable LLM agent prompts against theft and reuse by other proprietary models by making them non-portable.
This paper introduces Heimdallr, a novel framework that characterizes and detects LLM-induced security risks by analyzing the full execution chain of LLM integrations within GitHub CI workflows.
This paper analyzes the limitations of Counterfactual Knowledge Training (CFT) for LLM unlearning, identifying knowledge conflict and hallucination spillover as major pitfalls that hinder its effectiveness.
The paper introduces SEC-bench Pro, a rigorous benchmark for evaluating LLM-based bug hunting on complex software, finding that even advanced agents struggle with long-horizon security tasks.
The paper introduces RHELM, a new benchmark designed to test LLMs' long-term memory by simulating realistic, complex, and evolving dialogues that integrate multiple heterogeneous data sources.
The paper proposes Distribution-Aligned Self-Distillation (DASD) to improve self-distillation by dynamically filtering high-perplexity tokens, thereby preserving useful logical knowledge while suppressing harmful stylistic biases.
ACRONYM is a novel algorithm-hardware co-designed platform that enables high-recall, continuous approximate nearest neighbor search in memory for dynamic vector databases, achieving massive throughput and low energy consumption.
Agent libOS introduces a library-OS-inspired runtime substrate that treats LLM agents as schedulable processes, providing explicit capability control and robust auditing for long-running, stateful agent execution.
This survey provides a systematic framework and taxonomy for evidence tracing and execution provenance in LLM agents, addressing the difficulty of verifying and auditing complex agent behaviors.
TempoVLA is a novel Vision-Language-Action model that enables controllable execution speed for robot manipulation by explicitly conditioning the policy on the desired speed.
The paper proposes Cross-Layer Sparse Attention (CLSA) to significantly improve the efficiency and accuracy of long-context LLMs by jointly optimizing KV-cache sharing and the routing index across decoder layers.
The paper proposes OneReason, a framework that enhances the reasoning capability of generative recommendation models by focusing on improving item perception and structuring user behavior into coherent latent interests.
Papers
TempoVLA: Learning Speed-Controllable Vision-Language-Action Policies
Dong Jing, Jingchen Nie, Tianqi Zhang, Jiaqi Liu +3 more
TempoVLA is a novel Vision-Language-Action model that enables controllable execution speed for robot manipulation by explicitly conditioning the policy on the desired speed.