Built with and by Teycir Ben Soltane•
ArXivCSExplorer

Qi Li

34 indexed papers

Recent (6 mo): 34
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year [chart]

Top categories

Crypto ×20, AI ×13, Vision ×5, NLP ×4, ML ×2, Sound ×2, Software Eng. ×2, Robotics ×1

Frequent co-authors

Ke Xu (4×)
Xiaoqi Li (4×)
Wenkai Li (4×)
Zongwei Li (4×)
Jiaqi Li (3×)
Qi Liu (3×)

Research Timeline

2026
AgentWard: A Lifecycle Security Architecture for Autonomous AI Agents

The paper introduces AgentWard, a lifecycle-oriented, defense-in-depth architecture designed to systematically secure autonomous AI agents by protecting them across all stages of their operation.

MelShield: Robust Mel-Domain Audio Watermarking for Provenance Attribution of AI Generated Synthesized Speech

MelShield is a robust, in-generation audio watermarking framework that embeds identifiable signals into AI-generated speech in the Mel-spectrogram domain for reliable copyright protection and attribution.

Phishing Detection in Ethereum via Temporal Graph Contrastive Learning

The paper introduces PhishEye, a fully dynamic self-supervised system that models Ethereum transactions as a heterogeneous temporal attributed multi-graph and uses temporal graph contrastive learning to achieve high accuracy in detecting phishing activities.

AFL-ICP: Enhancing Industrial Control Protocol Reliability via Specification-Guided Fuzzing

AFL-ICP is a novel specification-driven fuzzing framework that significantly enhances the security testing of industrial control protocols by detecting subtle semantic and logic bugs missed by traditional fuzzers.

From AI-Generated Content to Agentic Action: Security and Safety Threats in Generative AI

The paper analyzes the escalating security and safety threats posed by generative AI systems as they transition from merely generating content to executing real-world actions via tools and agents, finding that current defenses lag behind capability deployment.

EgoBench: An Interactive Egocentric Multimodal Benchmark for Tool-Using Agents

The paper introduces EgoBench, the first interactive multimodal benchmark designed to jointly evaluate advanced AI agents' capabilities in visual perception, multi-hop reasoning, and dynamic tool usage in real-world, egocentric scenarios.

DeepSurvey: Enhancing Analytical Depth and Citation Reliability in Automated Survey Generation

DeepSurvey is an agentic system that significantly enhances automated survey generation by extracting deep, structured knowledge from full-text papers and rigorously validating citations, achieving superior content depth and reliability compared to existing methods.

Xetrieval: Mechanistically Explaining Dense Retrieval

Xetrieval introduces an embedding-level framework to mechanistically explain dense retrieval decisions by decomposing high-dimensional embeddings into sparse, human-interpretable features.

PassNet: Scaling Large Language Models for Graph Compiler Pass Generation

The paper introduces PassNet, a large-scale ecosystem for generating compiler passes with LLMs. It demonstrates that LLMs can significantly accelerate graph compilation for long-tail workloads and suggests that consistency is the primary bottleneck.

Entropy-KL Divergence-based Token Masking: A Novel Approach for Selective Fine-tuning of Large Language Models

The paper proposes EKSFT, a selective fine-tuning method that masks high-entropy or high-KL divergence tokens during Supervised Fine-Tuning (SFT) to prevent distribution shift and improve subsequent Reinforcement Learning (RL) performance.

Configurable Reward Model for Balanced Safety Alignment

The paper introduces the Configurable Safety Reward Model (CSRM), a novel reward model that can be jointly optimized for calibrated safety compliance and reward modeling, significantly improving LLM safety alignment across diverse and unseen safety configurations.

A Unified and Reproducible Experimentation Framework for Speech Understanding

The paper introduces SURE, a unified framework designed to standardize and improve the comparability and reproducibility of evaluations for advanced speech understanding models.

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement

The paper introduces SAVE, a framework that uses on-policy feedback and the value function to self-supervise and improve reward models, significantly enhancing RLHF performance across multiple benchmarks.

I-WebGenBench: Evaluating Interactivity in LLM-Generated Scientific Web Applications

The paper introduces I-WebGenBench, a framework and benchmark that converts static scientific papers into executable, interactive web systems, allowing users to dynamically explore the paper's mechanisms.

Sandboxed Coding Agents are Competitive Omni-modal Task Solvers

The paper demonstrates that specialized coding agents, using only text and image access within a sandbox, can effectively solve complex omnimodal tasks, often outperforming state-of-the-art native omnimodal models.

Repurposing Adversarial Perturbations for Continual Learning: From Defense to Active Alignment

The paper introduces AdvCL, a framework that repurposes adversarial perturbations as a geometric control signal to stabilize continual learning in large language models, significantly reducing forgetting and enhancing robustness.

Why Not Hyperparameter-Friendly Optimisation? A Monotonic Adaptive Norm Rescaling Approach For Long-Tailed Recognition

The paper proposes Self-Adaptive Monotonic Normalization (SAMN), a hyperparameter-friendly method that improves long-tailed recognition by enforcing monotonicity on per-class weight norms without requiring parameter regularization.

AutoMedBench: Towards Medical AutoResearch with Agentic AI Models

The paper introduces AutoMedBench, a novel workflow-aware benchmark that evaluates autonomous medical-AI agents across a five-stage research process, revealing that agents struggle most with validation and submission.

TempoVLA: Learning Speed-Controllable Vision-Language-Action Policies

TempoVLA is a novel Vision-Language-Action model that enables controllable execution speed for robot manipulation by explicitly conditioning the policy on the desired speed.

RedEdit: Agentic Red-Teaming of Image Safety Classifiers via MCTS-Guided Photo-Editing

The paper introduces RedEdit, an agentic red-teaming framework showing that malicious images can be easily edited to bypass safety classifiers while retaining their harmful semantics.


Papers

cs.RO, cs.AI | Recent | Jun 4, 2026

TempoVLA: Learning Speed-Controllable Vision-Language-Action Policies

Dong Jing, Jingchen Nie, Tianqi Zhang, Jiaqi Liu +3 more

TempoVLA is a novel Vision-Language-Action model that enables controllable execution speed for robot manipulation by explicitly conditioning the policy on the desired speed.

cs.CR | Recent | Jun 4, 2026

RedEdit: Agentic Red-Teaming of Image Safety Classifiers via MCTS-Guided Photo-Editing

Weilin Lin, Ziqi Lin, Zhenxing Zhou, Jianze Li +3 more

The paper introduces RedEdit, an agentic red-teaming framework showing that malicious images can be easily edited to bypass safety classifiers while retaining their harmful semantics.

cs.LG, cs.AI | Recent | Jun 1, 2026

Repurposing Adversarial Perturbations for Continual Learning: From Defense to Active Alignment

Ran Liu, Min Yu, Mingqi Liu, Jianguo Jiang +6 more

The paper introduces AdvCL, a framework that repurposes adversarial perturbations as a geometric control signal to stabilize continual learning in large language models, significantly reducing forgett…

cs.CV, cs.AI | Recent | Jun 1, 2026

Why Not Hyperparameter-Friendly Optimisation? A Monotonic Adaptive Norm Rescaling Approach For Long-Tailed Recognition

Shuo Zhang, Chenqi Li, Tingting Zhu

The paper proposes Self-Adaptive Monotonic Normalization (SAMN), a hyperparameter-friendly method that improves long-tailed recognition by enforcing monotonicity on per-class weight norms without requ…

cs.AI | Recent | Jun 1, 2026

AutoMedBench: Towards Medical AutoResearch with Agentic AI Models

Junqi Liu, Salena Song, Yuhan Wang, Jiawei Mao +11 more

The paper introduces AutoMedBench, a novel workflow-aware benchmark that evaluates autonomous medical-AI agents across a five-stage research process, revealing that agents struggle most with validatio…

cs.CL | Recent | May 30, 2026

I-WebGenBench: Evaluating Interactivity in LLM-Generated Scientific Web Applications

Dasen Dai, Biao Wu, Meng Fang, Shuoqi Li +1 more

The paper introduces I-WebGenBench, a framework and benchmark that converts static scientific papers into executable, interactive web systems, allowing users to dynamically explore the paper's mechani…

cs.CL, cs.CV | Recent | May 30, 2026

Sandboxed Coding Agents are Competitive Omni-modal Task Solvers

Dongping Chen, Xuanao Huang, Zhihan Hu, Qingyuan Shi +2 more

The paper demonstrates that specialized coding agents, using only text and image access within a sandbox, can effectively solve complex omnimodal tasks, often outperforming state-of-the-art native omn…

eess.AS, cs.AI, cs.SD | Recent | May 29, 2026

A Unified and Reproducible Experimentation Framework for Speech Understanding

Jing Peng, Junhao Du, Chenghao Wang, Hanqi Li +20 more

The paper introduces SURE, a unified framework designed to standardize and improve the comparability and reproducibility of evaluations for advanced speech understanding models.

cs.CL | Recent | May 29, 2026

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement

Xiaobo Wang, Tong Wu, Min Tang, Jiaqi Li +2 more

The paper introduces SAVE, a framework that uses on-policy feedback and the value function to self-supervise and improve reward models, significantly enhancing RLHF performance across multiple benchma…

cs.AI | Recent | May 28, 2026

DeepSurvey: Enhancing Analytical Depth and Citation Reliability in Automated Survey Generation

Ziyue Yang, Da Ma, Hanqi Li, Zijian Wang +7 more

DeepSurvey is an agentic system that significantly enhances automated survey generation by extracting deep, structured knowledge from full-text papers and rigorously validating citations, achieving su…

cs.AI, cs.IR | Recent | May 28, 2026

Xetrieval: Mechanistically Explaining Dense Retrieval

Zhixin Cai, Jun Bai, Yang Liu, Jiaqi Li +6 more

Xetrieval introduces an embedding-level framework to mechanistically explain dense retrieval decisions by decomposing high-dimensional embeddings into sparse, human-interpretable features.

cs.AI, cs.LG, cs.PL | Recent | May 28, 2026

PassNet: Scaling Large Language Models for Graph Compiler Pass Generation

Yiqun Liu, Yingsheng Wu, Ruqi Yang, Enrong Zheng +10 more

The paper introduces PassNet, a large-scale ecosystem for generating compiler passes using LLMs, demonstrating that LLMs can significantly accelerate graph compilation for long-tail workloads, suggest…

cs.AI | Recent | May 28, 2026

Entropy-KL Divergence-based Token Masking: A Novel Approach for Selective Fine-tuning of Large Language Models

Qi Liu, Mingdi Sun, Yongyi He, Zhi Zheng +4 more

The paper proposes EKSFT, a selective fine-tuning method that masks high-entropy or high-KL divergence tokens during Supervised Fine-Tuning (SFT) to prevent distribution shift and improve subsequent R…

cs.CL | Recent | May 28, 2026

Configurable Reward Model for Balanced Safety Alignment

Zhengping Jiang, Mehran Khodabandeh, Akash Bharadwaj, Manik Bhandari +4 more

The paper introduces the Configurable Safety Reward Model (CSRM), a novel reward model that can be jointly optimized for calibrated safety compliance and reward modeling, significantly improving LLM s…

cs.AI | Recent | May 27, 2026

EgoBench: An Interactive Egocentric Multimodal Benchmark for Tool-Using Agents

Yunqi Liu, Tong Niu, Zitong Wang, Zhenlong Dai +3 more

The paper introduces EgoBench, the first interactive multimodal benchmark designed to jointly evaluate advanced AI agents' capabilities in visual perception, multi-hop reasoning, and dynamic tool usag…

cs.CR | Recent | May 15, 2026

From AI-Generated Content to Agentic Action: Security and Safety Threats in Generative AI

Zelin Zhang, Qi Li, Jie Cao, Lingshuang Liu +1 more

The paper analyzes the escalating security and safety threats posed by generative AI systems as they transition from merely generating content to executing real-world actions via tools and agents, fin…

cs.CR, cs.NI, cs.SE | Recent | May 6, 2026

AFL-ICP: Enhancing Industrial Control Protocol Reliability via Specification-Guided Fuzzing

Jiaying Meng, Xuewei Feng, Qi Li, Min Liu +1 more

AFL-ICP is a novel specification-driven fuzzing framework that significantly enhances the security testing of industrial control protocols by detecting subtle semantic and logic bugs missed by traditi…

cs.SD, cs.CR | Recent | May 2, 2026

MelShield: Robust Mel-Domain Audio Watermarking for Provenance Attribution of AI Generated Synthesized Speech

Yutong Jin, Qi Li, Lingshuang Liu, Jianbing Ni

MelShield is a robust, in-generation audio watermarking framework that embeds identifiable signals into AI-generated speech in the Mel-spectrogram domain for reliable copyright protection and attribut…

cs.CR | Recent | May 2, 2026

Phishing Detection in Ethereum via Temporal Graph Contrastive Learning

Cong Wu, Jing Chen, Siqi Lin, Hongda Li +1 more

The paper introduces PhishEye, a fully dynamic self-supervised system that models Ethereum transactions as a heterogeneous temporal attributed multi-graph and uses temporal graph contrastive learning…

cs.CR, cs.AI | Recent | Apr 27, 2026

AgentWard: A Lifecycle Security Architecture for Autonomous AI Agents

Yixiang Zhang, Xinhao Deng, Jiaqing Wu, Yue Xiao +2 more

The paper introduces AgentWard, a lifecycle-oriented, defense-in-depth architecture designed to systematically secure autonomous AI agents by protecting them across all stages of their operation.
