Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:
ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Home/Authors/Qi Zhan

Qi Zhan

19 indexed papers

Recent (6 mo)
19
With code
0
Influential cites
0
Benchmarked
0

Publications per year

19
26

Top categories

Crypto×13AI×9NLP×6Software Eng.×4Robotics×2ML×2Info Retrieval×2Architecture×2

Frequent co-authors

Qi Zhang4×
Ziqi Zhang3×
Tianqi Zhang2×
Zhipeng Wei2×
Lingming Zhang2×
Dong Jing1×

Research Timeline

2026
Who Tests the Testers? Systematic Enumeration and Coverage Audit of LLM Agent Tool Call Safety

The paper introduces SafeAudit, a meta-audit framework that systematically enumerates test cases and uses a quantitative metric to uncover significant residual unsafe behaviors in LLM agents that existing benchmarks miss.

A Large-scale Empirical Study on the Generalizability of Disclosed Java Library Vulnerability Exploits

This paper conducts a large-scale empirical study demonstrating that Java library exploits can accurately identify affected versions, achieving high recall and precision, and proposes strategies for exploit migration to improve coverage.

Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses

This survey provides a comprehensive, structured review of safety research in Embodied AI, analyzing attacks and defenses across the entire embodied pipeline to guide the development of safe, robust, and reliable real-world agents.

Securing Retrieval-Augmented Generation: A Taxonomy of Attacks, Defenses, and Future Directions

This paper proposes a comprehensive taxonomy (SLOT) to systematically categorize security risks, attacks, and defenses specific to Retrieval-Augmented Generation (RAG), clarifying that these risks are distinct from inherent LLM flaws.

AnyPoC: Universal Proof-of-Concept Test Generation for Scalable LLM-Based Bug Detection

AnyPoC introduces a general multi-agent framework that reliably generates and validates executable Proof-of-Concept (PoC) tests from candidate bug reports, significantly improving automated bug detection accuracy across diverse software systems.

Privatar: Scalable Privacy-preserving Multi-user VR via Secure Offloading

Privatar introduces a scalable, privacy-preserving framework to offload computationally intensive multi-user avatar reconstruction from VR headsets to untrusted local devices, significantly improving user capacity while maintaining strong privacy guarantees.

Profiling for Pennies: Unveiling the Privacy Iceberg of LLM Agents

The paper introduces the PrivacyIceberg framework to systematically categorize and empirically demonstrate the high risk of automated, deep personal profiling using LLM agents, revealing a significant gap between public concern and platform safeguards.

PragLocker: Protecting Agent Intellectual Property in Untrusted Deployments via Non-Portable Prompts

PragLocker is a novel prompt protection scheme that secures valuable LLM agent prompts against theft and reuse by other proprietary models by making them non-portable.

Heimdallr: Characterizing and Detecting LLM-Induced Security Risks in GitHub CI Workflows

This paper introduces Heimdallr, a novel framework that characterizes and detects LLM-induced security risks by analyzing the full execution chain of LLM integrations within GitHub CI workflows.

On the Hidden Costs of Counterfactual Knowledge Training in LLM Unlearning

This paper analyzes the limitations of Counterfactual Knowledge Training (CFT) for LLM unlearning, identifying knowledge conflict and hallucination spillover as major pitfalls that hinder its effectiveness.

SEC-bench Pro: Can Language Models Solve Long-Horizon Software Security Tasks?

The paper introduces SEC-bench Pro, a rigorous benchmark for evaluating LLM-based bug hunting on complex software, finding that even advanced agents struggle with long-horizon security tasks.

Beyond Static Dialogues: Benchmarking Realistic, Heterogeneous, and Evolving Long-Term Memory

The paper introduces RHELM, a new benchmark designed to test LLMs' long-term memory by simulating realistic, complex, and evolving dialogues that integrate multiple heterogeneous data sources.

Robust Reasoning via Dynamic Token Selection for Distribution-Aligned Self-Distillation

The paper proposes Distribution-Aligned Self-Distillation (DASD) to improve self-distillation by dynamically filtering high-perplexity tokens, thereby preserving useful logical knowledge while suppressing harmful stylistic biases.

ACRONYM: Accelerated Approximate Nearest Neighbor Search in Memory for Dynamic Vector Databases

ACRONYM is a novel algorithm-hardware co-designed platform that enables high-recall, continuous approximate nearest neighbor search in memory for dynamic vector databases, achieving massive throughput and low energy consumption.

Agent libOS: A Library-OS-Inspired Runtime for Long-Running, Capability-Controlled LLM Agents

Agent libOS introduces a library-OS-inspired runtime substrate that treats LLM agents as schedulable processes, providing explicit capability control and robust auditing for long-running, stateful agent execution.

From Agent Traces to Trust: Evidence Tracing and Execution Provenance in LLM Agents

This survey provides a systematic framework and taxonomy for evidence tracing and execution provenance in LLM agents, addressing the difficulty of verifying and auditing complex agent behaviors.

TempoVLA: Learning Speed-Controllable Vision-Language-Action Policies

TempoVLA is a novel Vision-Language-Action model that enables controllable execution speed for robot manipulation by explicitly conditioning the policy on the desired speed.

You Only Index Once: Cross-Layer Sparse Attention with Shared Routing

The paper proposes Cross-Layer Sparse Attention (CLSA) to significantly improve the efficiency and accuracy of long-context LLMs by jointly optimizing KV-cache sharing and the routing index across decoder layers.

OneReason Technical Report

The paper proposes OneReason, a framework that enhances the reasoning capability of generative recommendation models by focusing on improving item perception and structuring user behavior into coherent latent interests.

Highlighted terms show continued research focus across papers

Papers

cs.ROcs.AIRecentJun 4, 2026

TempoVLA: Learning Speed-Controllable Vision-Language-Action Policies

Dong Jing, Jingchen Nie, Tianqi Zhang, Jiaqi Liu +3 more

TempoVLA is a novel Vision-Language-Action model that enables controllable execution speed for robot manipulation by explicitly conditioning the policy on the desired speed.

View →
cs.CLcs.AIcs.LGRecentJun 4, 2026

You Only Index Once: Cross-Layer Sparse Attention with Shared Routing

Yutao Sun, Yanqi Zhang, Li Dong, Jianyong Wang +1 more

The paper proposes Cross-Layer Sparse Attention (CLSA) to significantly improve the efficiency and accuracy of long-context LLMs by jointly optimizing KV-cache sharing and the routing index across dec…

View →
cs.IRcs.AIcs.CLRecentJun 4, 2026

OneReason Technical Report

OneRec Team, Biao Yang, Boyang Ding, Chenglong Chu +80 more

The paper proposes OneReason, a framework that enhances the reasoning capability of generative recommendation models by focusing on improving item perception and structuring user behavior into coheren…

View →
cs.CRcs.AIRecentJun 3, 2026

From Agent Traces to Trust: Evidence Tracing and Execution Provenance in LLM Agents

Yiqi Wang, Jiaqi Zhang, Taotao Cai, Zirui Liu +5 more

This survey provides a systematic framework and taxonomy for evidence tracing and execution provenance in LLM agents, addressing the difficulty of verifying and auditing complex agent behaviors.

View →
cs.ARcs.DBcs.ETRecentJun 2, 2026

ACRONYM: Accelerated Approximate Nearest Neighbor Search in Memory for Dynamic Vector Databases

Md Mizanur Rahaman Nayan, Tianqi Zhang, Flavio Ponzina, Tajana Rosing +1 more

ACRONYM is a novel algorithm-hardware co-designed platform that enables high-recall, continuous approximate nearest neighbor search in memory for dynamic vector databases, achieving massive throughput…

View →
cs.OScs.AIcs.CRRecentJun 2, 2026

Agent libOS: A Library-OS-Inspired Runtime for Long-Running, Capability-Controlled LLM Agents

Yingqi Zhang

Agent libOS introduces a library-OS-inspired runtime substrate that treats LLM agents as schedulable processes, providing explicit capability control and robust auditing for long-running, stateful age…

View →
cs.CLRecentMay 30, 2026

Robust Reasoning via Dynamic Token Selection for Distribution-Aligned Self-Distillation

Ruiqi Zhang, Lingxiang Wang, Hainan Zhang Zhiming Zheng

The paper proposes Distribution-Aligned Self-Distillation (DASD) to improve self-distillation by dynamically filtering high-perplexity tokens, thereby preserving useful logical knowledge while suppres…

View →
cs.CLcs.IRRecentMay 29, 2026

Beyond Static Dialogues: Benchmarking Realistic, Heterogeneous, and Evolving Long-Term Memory

Han Zhang, Zihao Tang, Xin Yu, Xiao Liu +7 more

The paper introduces RHELM, a new benchmark designed to test LLMs' long-term memory by simulating realistic, complex, and evolving dialogues that integrate multiple heterogeneous data sources.

View →
cs.CLcs.CRRecentMay 26, 2026

On the Hidden Costs of Counterfactual Knowledge Training in LLM Unlearning

Xiaotian Ye, Xiaohan Wang, Mengqi Zhang, Shu Wu

This paper analyzes the limitations of Counterfactual Knowledge Training (CFT) for LLM unlearning, identifying knowledge conflict and hallucination spillover as major pitfalls that hinder its effectiv…

View →
cs.CRcs.LGRecentMay 26, 2026

SEC-bench Pro: Can Language Models Solve Long-Horizon Software Security Tasks?

Hwiwon Lee, Jiawei Liu, Dongjun Kim, Ziqi Zhang +2 more

The paper introduces SEC-bench Pro, a rigorous benchmark for evaluating LLM-based bug hunting on complex software, finding that even advanced agents struggle with long-horizon security tasks.

View →
cs.CRRecentMay 7, 2026

Profiling for Pennies: Unveiling the Privacy Iceberg of LLM Agents

Jiahao Chen, Qi Zhang, Ruixiao Lin, Chunyi Zhou +6 more

The paper introduces the PrivacyIceberg framework to systematically categorize and empirically demonstrate the high risk of automated, deep personal profiling using LLM agents, revealing a significant…

View →
cs.CRcs.AIRecentMay 7, 2026

PragLocker: Protecting Agent Intellectual Property in Untrusted Deployments via Non-Portable Prompts

Qinfeng Li, Yuntai Bao, Jianghui Hu, Wenqi Zhang +4 more

PragLocker is a novel prompt protection scheme that secures valuable LLM agent prompts against theft and reuse by other proprietary models by making them non-portable.

View →
cs.CRcs.SERecentMay 7, 2026

Heimdallr: Characterizing and Detecting LLM-Induced Security Risks in GitHub CI Workflows

Bonan Ruan, Yeqi Fu, Chuqi Zhang, Jiahao Liu +2 more

This paper introduces Heimdallr, a novel framework that characterizes and detects LLM-induced security risks by analyzing the full execution chain of LLM integrations within GitHub CI workflows.

View →
cs.CRcs.ARcs.CVRecentApr 19, 2026

Privatar: Scalable Privacy-preserving Multi-user VR via Secure Offloading

Jianming Tong, Hanshen Xiao, Krishna Kumar Nair, Hao Kang +4 more

Privatar introduces a scalable, privacy-preserving framework to offload computationally intensive multi-user avatar reconstruction from VR headsets to untrusted local devices, significantly improving…

View →
cs.SEcs.AIcs.CLRecentApr 13, 2026

AnyPoC: Universal Proof-of-Concept Test Generation for Scalable LLM-Based Bug Detection

Zijie Zhao, Chenyuan Yang, Weidong Wang, Yihan Yang +2 more

AnyPoC introduces a general multi-agent framework that reliably generates and validates executable Proof-of-Concept (PoC) tests from candidate bug reports, significantly improving automated bug detect…

View →
cs.CRcs.AIRecentApr 9, 2026

Securing Retrieval-Augmented Generation: A Taxonomy of Attacks, Defenses, and Future Directions

Yuming Xu, Mingtao Zhang, Zhuohan Ge, Haoyang Li +6 more

This paper proposes a comprehensive taxonomy (SLOT) to systematically categorize security risks, attacks, and defenses specific to Retrieval-Augmented Generation (RAG), clarifying that these risks are…

View →
cs.CRcs.AIcs.CVRecentMar 28, 2026

Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses

Xiao Li, Xiang Zheng, Yifeng Gao, Xinyu Xia +34 more

This survey provides a comprehensive, structured review of safety research in Embodied AI, analyzing attacks and defenses across the entire embodied pipeline to guide the development of safe, robust,…

View →
cs.SEcs.CRRecentMar 27, 2026

A Large-scale Empirical Study on the Generalizability of Disclosed Java Library Vulnerability Exploits

Zirui Chen, Qi Zhan, Jiayuan Zhou, Xing Hu +2 more

This paper conducts a large-scale empirical study demonstrating that Java library exploits can accurately identify affected versions, achieving high recall and precision, and proposes strategies for e…

View →
cs.SEcs.CRRecentMar 18, 2026

Who Tests the Testers? Systematic Enumeration and Coverage Audit of LLM Agent Tool Call Safety

Xuan Chen, Lu Yan, Ruqi Zhang, Xiangyu Zhang

The paper introduces SafeAudit, a meta-audit framework that systematically enumerates test cases and uses a quantitative metric to uncover significant residual unsafe behaviors in LLM agents that exis…

View →