Qiang Li
16 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper introduces PAuth, a new authorization model that grants agents only the precise permissions needed for a specific natural-language task, preventing overprivileging inherent in existing operator-scoped models.
This paper analyzes large-scale reasoning traces from LLM-based binary vulnerability analysis, identifying four structured, token-level implicit patterns that govern how LLMs explore code paths.
This study conducts a large-scale empirical analysis of third-party LLM agent skills, identifying that credential leakage is a pervasive, cross-modal issue primarily caused by debug logging and resulting in exploitable, persistent secrets.
Styx is a novel framework that enhances data privacy and security in collaborative data processing, such as joint AI training, by integrating sticky policies with Trusted Execution Environments (TEEs).
The paper introduces SET, a robust input-level backdoor detection framework that detects hidden malicious triggers in text-to-image diffusion models by analyzing systematic differences in how benign and backdoor inputs respond to controlled cross-attention scaling perturbations.
The paper introduces FORGE, a feedback-driven execution system that improves LLM-based binary analysis by interleaving reasoning and tool interaction, achieving high-quality vulnerability discovery on complex firmware binaries.
The paper introduces TICoE, a text-image collaborative framework that achieves precise and faithful concept removal from text-to-image generative models, surpassing existing methods in both precision and content fidelity.
The paper presents the Serpent attack, a practical cross-device token replay vulnerability, demonstrating that Apple Intelligence's anonymous access tokens can be stolen and reused on different devices, even when the victim's usage is rate-limited.
The paper introduces REBench, a comprehensive, standardized benchmark dataset designed to enable fair and rigorous evaluation of Large Language Models (LLMs) on complex binary reverse engineering tasks.
The paper introduces OmniMatBench, a comprehensive, human-calibrated multimodal reasoning benchmark covering 19 materials science subfields, revealing that current multimodal language models (MLLMs) have significant gaps in complex materials-science reasoning.
SkillsInjector proposes a two-stage adaptive method to dynamically optimize skill selection, quantity, and presentation for LLM agents, significantly improving task performance over static injection methods.
The paper proposes Meta-Team, an experience-driven framework that enables multi-agent systems (MAS) to collaboratively self-evolve by transforming complex execution experiences into reusable improvements for agent behaviors and coordination.
The paper introduces Posterior Hybrid Bayesian Belief (PhyB), a novel framework that reformulates policy optimization in Bayesian Offline RL by approximating expectations as a convex combination over a subset of dynamics models, achieving state-of-the-art performance.
The paper identifies and demonstrates a novel vulnerability, cross-app context poisoning, in the shared context architecture of ChatGPT Apps, allowing malicious apps to manipulate the LLM's behavior across different, benign co-resident apps.
The paper proposes EAPO, a framework that enables agentic models to learn when to forgo using external tools, thereby mitigating tool abuse while maintaining high reasoning accuracy.
CRAFTQA introduces a novel adaptive, code-driven framework that significantly enhances complex structured data reasoning by dynamically generating custom code functions beyond predefined operations.
Papers
Learning When Not to Act: Mitigating Tool Abuse in Agentic Reinforcement Learning
Liuji Chen, Dianxing Tang, Xing Shi, Dingshuo Chen +3 more
The paper proposes EAPO, a framework that enables agentic models to learn when to forgo using external tools, thereby mitigating tool abuse while maintaining high reasoning accuracy.