Qi Zhan

19 indexed papers

Recent (6 mo)

With code

Influential cites

Benchmarked

Publications per year

Top categories

Crypto×13AI×9NLP×6Software Eng.×4Robotics×2ML×2Info Retrieval×2Architecture×2

Frequent co-authors

Qi Zhang4×

Ziqi Zhang3×

Tianqi Zhang2×

Zhipeng Wei2×

Lingming Zhang2×

Dong Jing1×

Research Timeline

2026

Who Tests the Testers? Systematic Enumeration and Coverage Audit of LLM Agent Tool Call Safety

The paper introduces SafeAudit, a meta-audit framework that systematically enumerates test cases and uses a quantitative metric to uncover significant residual unsafe behaviors in LLM agents that existing benchmarks miss.

A Large-scale Empirical Study on the Generalizability of Disclosed Java Library Vulnerability Exploits

This paper conducts a large-scale empirical study demonstrating that Java library exploits can accurately identify affected versions, achieving high recall and precision, and proposes strategies for exploit migration to improve coverage.

Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses

This survey provides a comprehensive, structured review of safety research in Embodied AI, analyzing attacks and defenses across the entire embodied pipeline to guide the development of safe, robust, and reliable real-world agents.

Securing Retrieval-Augmented Generation: A Taxonomy of Attacks, Defenses, and Future Directions

This paper proposes a comprehensive taxonomy (SLOT) to systematically categorize security risks, attacks, and defenses specific to Retrieval-Augmented Generation (RAG), clarifying that these risks are distinct from inherent LLM flaws.

AnyPoC: Universal Proof-of-Concept Test Generation for Scalable LLM-Based Bug Detection

AnyPoC introduces a general multi-agent framework that reliably generates and validates executable Proof-of-Concept (PoC) tests from candidate bug reports, significantly improving automated bug detection accuracy across diverse software systems.

Privatar: Scalable Privacy-preserving Multi-user VR via Secure Offloading

Privatar introduces a scalable, privacy-preserving framework to offload computationally intensive multi-user avatar reconstruction from VR headsets to untrusted local devices, significantly improving user capacity while maintaining strong privacy guarantees.

Profiling for Pennies: Unveiling the Privacy Iceberg of LLM Agents

The paper introduces the PrivacyIceberg framework to systematically categorize and empirically demonstrate the high risk of automated, deep personal profiling using LLM agents, revealing a significant gap between public concern and platform safeguards.

PragLocker: Protecting Agent Intellectual Property in Untrusted Deployments via Non-Portable Prompts

PragLocker is a novel prompt protection scheme that secures valuable LLM agent prompts against theft and reuse by other proprietary models by making them non-portable.

Heimdallr: Characterizing and Detecting LLM-Induced Security Risks in GitHub CI Workflows

This paper introduces Heimdallr, a novel framework that characterizes and detects LLM-induced security risks by analyzing the full execution chain of LLM integrations within GitHub CI workflows.

On the Hidden Costs of Counterfactual Knowledge Training in LLM Unlearning

This paper analyzes the limitations of Counterfactual Knowledge Training (CFT) for LLM unlearning, identifying knowledge conflict and hallucination spillover as major pitfalls that hinder its effectiveness.

SEC-bench Pro: Can Language Models Solve Long-Horizon Software Security Tasks?

The paper introduces SEC-bench Pro, a rigorous benchmark for evaluating LLM-based bug hunting on complex software, finding that even advanced agents struggle with long-horizon security tasks.

Beyond Static Dialogues: Benchmarking Realistic, Heterogeneous, and Evolving Long-Term Memory

The paper introduces RHELM, a new benchmark designed to test LLMs' long-term memory by simulating realistic, complex, and evolving dialogues that integrate multiple heterogeneous data sources.

Robust Reasoning via Dynamic Token Selection for Distribution-Aligned Self-Distillation

The paper proposes Distribution-Aligned Self-Distillation (DASD) to improve self-distillation by dynamically filtering high-perplexity tokens, thereby preserving useful logical knowledge while suppressing harmful stylistic biases.

ACRONYM: Accelerated Approximate Nearest Neighbor Search in Memory for Dynamic Vector Databases

ACRONYM is a novel algorithm-hardware co-designed platform that enables high-recall, continuous approximate nearest neighbor search in memory for dynamic vector databases, achieving massive throughput and low energy consumption.

Agent libOS: A Library-OS-Inspired Runtime for Long-Running, Capability-Controlled LLM Agents

Agent libOS introduces a library-OS-inspired runtime substrate that treats LLM agents as schedulable processes, providing explicit capability control and robust auditing for long-running, stateful agent execution.

From Agent Traces to Trust: Evidence Tracing and Execution Provenance in LLM Agents

This survey provides a systematic framework and taxonomy for evidence tracing and execution provenance in LLM agents, addressing the difficulty of verifying and auditing complex agent behaviors.

TempoVLA: Learning Speed-Controllable Vision-Language-Action Policies

TempoVLA is a novel Vision-Language-Action model that enables controllable execution speed for robot manipulation by explicitly conditioning the policy on the desired speed.

You Only Index Once: Cross-Layer Sparse Attention with Shared Routing

The paper proposes Cross-Layer Sparse Attention (CLSA) to significantly improve the efficiency and accuracy of long-context LLMs by jointly optimizing KV-cache sharing and the routing index across decoder layers.

OneReason Technical Report

The paper proposes OneReason, a framework that enhances the reasoning capability of generative recommendation models by focusing on improving item perception and structuring user behavior into coherent latent interests.

Highlighted terms show continued research focus across papers

Papers

cs.ROcs.AIRecentJun 4, 2026

TempoVLA: Learning Speed-Controllable Vision-Language-Action Policies

Dong Jing, Jingchen Nie, Tianqi Zhang, Jiaqi Liu +3 more

TempoVLA is a novel Vision-Language-Action model that enables controllable execution speed for robot manipulation by explicitly conditioning the policy on the desired speed.

View →

cs.CLcs.AIcs.LGRecentJun 4, 2026