Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:
ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Home/Authors/Qiang Li

Qiang Li

16 indexed papers

Recent (6 mo)
16
With code
0
Influential cites
0
Benchmarked
0

Publications per year

16
26

Top categories

Crypto×10AI×8ML×2Software Eng.×2Vision×2NLP×1Multiagent×1Prog. Lang.×1

Frequent co-authors

Zhiqiang Lin5×
Chao Wang2×
Wanhao Liu2×
Jiaqing Xie2×
Tianfan Fu2×
Yuqiang Li2×

Research Timeline

2026
PAuth - Precise Task-Scoped Authorization For Agents

The paper introduces PAuth, a new authorization model that grants agents only the precise permissions needed for a specific natural-language task, preventing overprivileging inherent in existing operator-scoped models.

Implicit Patterns in LLM-Based Binary Analysis

This paper analyzes large-scale reasoning traces from LLM-based binary vulnerability analysis, identifying four structured, token-level implicit patterns that govern how LLMs explore code paths.

Credential Leakage in LLM Agent Skills: A Large-Scale Empirical Study

This study conducts a large-scale empirical analysis of third-party LLM agent skills, identifying that credential leakage is a pervasive, cross-modal issue primarily caused by debug logging and resulting in exploitable, persistent secrets.

Styx: Collaborative and Private Data Processing With TEE-Enforced Sticky Policy

Styx is a novel framework that enhances data privacy and security in collaborative data processing, such as joint AI training, by integrating sticky policies with Trusted Execution Environments (TEEs).

Scaling Exposes the Trigger: Input-Level Backdoor Detection in Text-to-Image Diffusion Models via Cross-Attention Scaling

The paper introduces SET, a robust input-level backdoor detection framework that detects hidden malicious triggers in text-to-image diffusion models by analyzing systematic differences in how benign and backdoor inputs respond to controlled cross-attention scaling perturbations.

Feedback-Driven Execution for LLM-Based Binary Analysis

The paper introduces FORGE, a feedback-driven execution system that improves LLM-based binary analysis by interleaving reasoning and tool interaction, achieving high-quality vulnerability discovery on complex firmware binaries.

Beyond Text Prompts: Precise Concept Erasure through Text-Image Collaboration

The paper introduces TICoE, a text-image collaborative framework that achieves precise and faithful concept removal from text-to-image generative models, surpassing existing methods in both precision and content fidelity.

Too Private to Tell: Practical Token Theft Attacks on Apple Intelligence

The paper presents the Serpent attack, a practical cross-device token replay vulnerability, demonstrating that Apple Intelligence's anonymous access tokens can be stolen and reused on different devices, even when the victim's usage is rate-limited.

REBENCH: A Procedural, Fair-by-Construction Benchmark for LLMs on Stripped-Binary Types and Names (Extended Version)

The paper introduces REBench, a comprehensive, standardized benchmark dataset designed to enable fair and rigorous evaluation of Large Language Models (LLMs) on complex binary reverse engineering tasks.

OmniMatBench: A Human-Calibrated Multimodal Reasoning Benchmark Across 19 Materials Science Subfields

The paper introduces OmniMatBench, a comprehensive, human-calibrated multimodal reasoning benchmark covering 19 materials science subfields, revealing that current multimodal language models (MLLMs) have significant gaps in complex materials-science reasoning.

SkillsInjector: Dynamic Skill Context Construction for LLM Agents

SkillsInjector proposes a two-stage adaptive method to dynamically optimize skill selection, quantity, and presentation for LLM agents, significantly improving task performance over static injection methods.

Evolve as a Team: Collaborative Self-Evolution for LLM-based Multi-Agent Systems

The paper proposes Meta-Team, an experience-driven framework that enables multi-agent systems (MAS) to collaboratively self-evolve by transforming complex execution experiences into reusable improvements for agent behaviors and coordination.

Regularized Offline Policy Optimization with Posterior Hybrid Bayesian Belief

The paper introduces Posterior Hybrid Bayesian Belief (PhyB), a novel framework that reformulates policy optimization in Bayesian Offline RL by approximating expectations as a convex combination over a subset of dynamics models, achieving state-of-the-art performance.

Confused ChatGPT: Cross-App Context Poisoning via First-Party APIs

The paper identifies and demonstrates a novel vulnerability, cross-app context poisoning, in the shared context architecture of ChatGPT Apps, allowing malicious apps to manipulate the LLM's behavior across different, benign co-resident apps.

Learning When Not to Act: Mitigating Tool Abuse in Agentic Reinforcement Learning

The paper proposes EAPO, a framework that enables agentic models to learn when to forgo using external tools, thereby mitigating tool abuse while maintaining high reasoning accuracy.

CRAFTQA: A Code-Driven Adaptive Framework for Complex Structured Data Reasoning

CRAFTQA introduces a novel adaptive, code-driven framework that significantly enhances complex structured data reasoning by dynamically generating custom code functions beyond predefined operations.

Highlighted terms show continued research focus across papers

Papers

cs.AIRecentJun 1, 2026

Learning When Not to Act: Mitigating Tool Abuse in Agentic Reinforcement Learning

Liuji Chen, Dianxing Tang, Xing Shi, Dingshuo Chen +3 more

The paper proposes EAPO, a framework that enables agentic models to learn when to forgo using external tools, thereby mitigating tool abuse while maintaining high reasoning accuracy.

View →
cs.CLRecentJun 1, 2026

CRAFTQA: A Code-Driven Adaptive Framework for Complex Structured Data Reasoning

Chengtao Gan, Zhiqiang Liu, Long Jin, Yushan Zhu +2 more

CRAFTQA introduces a novel adaptive, code-driven framework that significantly enhances complex structured data reasoning by dynamically generating custom code functions beyond predefined operations.

View →
cs.AIcs.LGRecentMay 30, 2026

Regularized Offline Policy Optimization with Posterior Hybrid Bayesian Belief

Hongqiang Lin, Pengfei Wang, Nenggan Zheng

The paper introduces Posterior Hybrid Bayesian Belief (PhyB), a novel framework that reformulates policy optimization in Bayesian Offline RL by approximating expectations as a convex combination over…

View →
cs.CRRecentMay 30, 2026

Confused ChatGPT: Cross-App Context Poisoning via First-Party APIs

Chao Wang, Somesh Jha, Zhiqiang Lin

The paper identifies and demonstrates a novel vulnerability, cross-app context poisoning, in the shared context architecture of ChatGPT Apps, allowing malicious apps to manipulate the LLM's behavior a…

View →
cs.AIRecentMay 28, 2026

OmniMatBench: A Human-Calibrated Multimodal Reasoning Benchmark Across 19 Materials Science Subfields

Wanhao Liu, Jiaqing Xie, Qian Tan, Weida Wang +9 more

The paper introduces OmniMatBench, a comprehensive, human-calibrated multimodal reasoning benchmark covering 19 materials science subfields, revealing that current multimodal language models (MLLMs) h…

View →
cs.AIRecentMay 28, 2026

SkillsInjector: Dynamic Skill Context Construction for LLM Agents

Yanchao Li, Wanhao Liu, Ben Gao, Jiaqing Xie +4 more

SkillsInjector proposes a two-stage adaptive method to dynamically optimize skill selection, quantity, and presentation for LLM agents, significantly improving task performance over static injection m…

View →
cs.MAcs.AIRecentMay 28, 2026

Evolve as a Team: Collaborative Self-Evolution for LLM-based Multi-Agent Systems

Zhezheng Hao, Tianfu Wang, Huanshuo Dong, Ziyan Liu +6 more

The paper proposes Meta-Team, an experience-driven framework that enables multi-agent systems (MAS) to collaboratively self-evolve by transforming complex execution experiences into reusable improveme…

View →
cs.CRcs.LGcs.SERecentApr 30, 2026

REBENCH: A Procedural, Fair-by-Construction Benchmark for LLMs on Stripped-Binary Types and Names (Extended Version)

Jun Yeon Won, Xin Jin, Shiqing Ma, Zhiqiang Lin

The paper introduces REBench, a comprehensive, standardized benchmark dataset designed to enable fair and rigorous evaluation of Large Language Models (LLMs) on complex binary reverse engineering task…

View →
cs.CVcs.CRRecentApr 17, 2026

Beyond Text Prompts: Precise Concept Erasure through Text-Image Collaboration

Jun Li, Lizhi Xiong, Ziqiang Li, Weiwei Jiang +3 more

The paper introduces TICoE, a text-image collaborative framework that achieves precise and faithful concept removal from text-to-image generative models, surpassing existing methods in both precision…

View →
cs.CRRecentApr 17, 2026

Too Private to Tell: Practical Token Theft Attacks on Apple Intelligence

Haoling Zhou, Shixuan Zhao, Chao Wang, Zhiqiang Lin

The paper presents the Serpent attack, a practical cross-device token replay vulnerability, demonstrating that Apple Intelligence's anonymous access tokens can be stolen and reused on different device…

View →
cs.CRRecentApr 16, 2026

Feedback-Driven Execution for LLM-Based Binary Analysis

XiangRui Zhang, Qiang Li, Haining Wang

The paper introduces FORGE, a feedback-driven execution system that improves LLM-based binary analysis by interleaving reasoning and tool interaction, achieving high-quality vulnerability discovery on…

View →
cs.CRcs.CVRecentApr 14, 2026

Scaling Exposes the Trigger: Input-Level Backdoor Detection in Text-to-Image Diffusion Models via Cross-Attention Scaling

Zida Li, Jun Li, Yuzhe Sha, Ziqiang Li +2 more

The paper introduces SET, a robust input-level backdoor detection framework that detects hidden malicious triggers in text-to-image diffusion models by analyzing systematic differences in how benign a…

View →
cs.CRRecentApr 5, 2026

Styx: Collaborative and Private Data Processing With TEE-Enforced Sticky Policy

Shixuan Zhao, Weicheng Wang, Ninghui Li, Zhiqiang Lin

Styx is a novel framework that enhances data privacy and security in collaborative data processing, such as joint AI training, by integrating sticky policies with Trusted Execution Environments (TEEs)…

View →
cs.CRcs.AIRecentApr 3, 2026

Credential Leakage in LLM Agent Skills: A Large-Scale Empirical Study

Zhihao Chen, Ying Zhang, Yi Liu, Gelei Deng +6 more

This study conducts a large-scale empirical analysis of third-party LLM agent skills, identifying that credential leakage is a pervasive, cross-modal issue primarily caused by debug logging and result…

View →
cs.AIcs.CRcs.SERecentMar 19, 2026

Implicit Patterns in LLM-Based Binary Analysis

Qiang Li, XiangRui Zhang, Haining Wang

This paper analyzes large-scale reasoning traces from LLM-based binary vulnerability analysis, identifying four structured, token-level implicit patterns that govern how LLMs explore code paths.

View →
cs.CRcs.AIcs.PLRecentMar 17, 2026

PAuth - Precise Task-Scoped Authorization For Agents

Reshabh K Sharma, Linxi Jiang, Zhiqiang Lin, Shuo Chen

The paper introduces PAuth, a new authorization model that grants agents only the precise permissions needed for a specific natural-language task, preventing overprivileging inherent in existing opera…

View →