Jing Li
13 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
This paper introduces a novel backdoor attack (ACB) against Reinforcement Learning with Verifiable Rewards (RLVR), demonstrating that poisoning the training data can implant a backdoor that significantly degrades the LLM's safety performance.
The paper introduces BAIT, a three-step jailbreak framework that systematically forces large language models to disclose harmful information by leveraging their internal reasoning and consistency tendencies.
The paper introduces KTD-Fin, a novel benchmark that evaluates LLM trading agents by masking historical market data and decomposing returns, finding that LLM agents' profits are largely due to passive market exposure rather than genuine stock-selection alpha.
SmartDirector is a novel framework that significantly improves cinematic video generation by using multiple keyframes to provide precise control over narrative structure and temporal pacing.
Qwen-VLA introduces a unified embodied foundation model that extends vision-language understanding to continuous action generation, enabling robust, multi-task generalization across diverse robotic tasks and embodiments.
The paper introduces Cookie-Bench, a novel, autonomous, and reference-free evaluation framework that significantly improves the assessment of interactive web generation capabilities for frontier LLMs.
RACE-Sched is an asynchronous agentic framework that successfully integrates low-latency, real-time scheduling decisions with advanced, long-horizon reasoning provided by Large Language Models.
The paper introduces VIABLE, the first benchmark for evaluating Vision-Language Models (VLMs) as judges for Visually Impaired Assistance (VIA), finding that current models are largely unreliable and proposing VIA-Judge-Agent to improve evaluation.
The paper provides a formal statistical and conceptual framework for defining and measuring 'pairwise reference alignment,' which quantifies how well a model's scoring function agrees with a given reference distribution of preferred and rejected response pairs.
WaveFilter is a novel, training-free framework that uses wavelet transforms to efficiently filter critical tokens in the KV cache, significantly improving the long-context performance of Diffusion LLMs.
The paper introduces RefMem-Bench, a new benchmark for measuring reflective memory in long-horizon dialogue, and proposes REMIND, a framework that significantly improves models' ability to synthesize fragmented cues into high-level interpretations.
The paper proposes using Vision-Language Models (VLMs) as 'teachers' to guide Video Generation Models (VGMs) during test-time optimization, significantly improving video reasoning capabilities.
This paper addresses the challenge of detecting and explaining AI-manipulated segments within long, untrimmed videos by proposing a new benchmark and a coarse-to-fine forensic detection framework.
Papers
VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization
Junhao Cheng, Liang Hou, Tianxiong Zhong, Xin Tao +3 more
The paper proposes using Vision-Language Models (VLMs) as 'teachers' to guide Video Generation Models (VGMs) during test-time optimization, significantly improving video reasoning capabilities.