Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:
ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Home/Authors/Jing Li

Jing Li

13 indexed papers

Recent (6 mo)
13
With code
0
Influential cites
0
Benchmarked
0

Publications per year

13
26

Top categories

AI×8NLP×6Vision×4Crypto×2ML×1Robotics×1Trading and Market Microstructure×1

Frequent co-authors

Jing Liu3×
Ruifeng Xu2×
Junhao Cheng1×
Liang Hou1×
Tianxiong Zhong1×
Xin Tao1×

Research Timeline

2026
Backdoors in RLVR: Jailbreak Backdoors in LLMs From Verifiable Reward

This paper introduces a novel backdoor attack (ACB) against Reinforcement Learning with Verifiable Rewards (RLVR), demonstrating that poisoning the training data can implant a backdoor that significantly degrades the LLM's safety performance.

BAIT: Boundary-Guided Disclosure Escalation via Self-Conditioned Reasoning

The paper introduces BAIT, a three-step jailbreak framework that systematically forces large language models to disclose harmful information by leveraging their internal reasoning and consistency tendencies.

From Knowing to Doing: A Memory-Controlled Benchmark for LLM Trading Agents on Stock Markets

The paper introduces KTD-Fin, a novel benchmark that evaluates LLM trading agents by masking historical market data and decomposing returns, finding that LLM agents' profits are largely due to passive market exposure rather than genuine stock-selection alpha.

SmartDirector: Keyframe-Conditioned Cinematic Video Generation with Narrative Pacing Control

SmartDirector is a novel framework that significantly improves cinematic video generation by using multiple keyframes to provide precise control over narrative structure and temporal pacing.

Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments

Qwen-VLA introduces a unified embodied foundation model that extends vision-language understanding to continuous action generation, enabling robust, multi-task generalization across diverse robotic tasks and embodiments.

Cookie-Bench: Continuous On-screen Key Interaction Evaluation for Web Generation

The paper introduces Cookie-Bench, a novel, autonomous, and reference-free evaluation framework that significantly improves the assessment of interactive web generation capabilities for frontier LLMs.

Harmonizing Real-Time Constraints and Long-Horizon Reasoning: An Asynchronous Agentic Framework for Dynamic Scheduling

RACE-Sched is an asynchronous agentic framework that successfully integrates low-latency, real-time scheduling decisions with advanced, long-horizon reasoning provided by Large Language Models.

A Visually Impaired Assistance Benchmark for VLM-as-a-Judge Evaluation

The paper introduces VIABLE, the first benchmark for evaluating Vision-Language Models (VLMs) as judges for Visually Impaired Assistance (VIA), finding that current models are largely unreliable and proposing VIA-Judge-Agent to improve evaluation.

Pairwise Reference Alignment as a Model-Level Ordinal Observable

The paper provides a formal statistical and conceptual framework for defining and measuring 'pairwise reference alignment,' which quantifies how well a model's scoring function agrees with a given reference distribution of preferred and rejected response pairs.

WaveFilter: Enhancing the Long-Context Capability of Diffusion LLMs via Wavelet-Guided KV Cache Filtering

WaveFilter is a novel, training-free framework that uses wavelet transforms to efficiently filter critical tokens in the KV cache, significantly improving the long-context performance of Diffusion LLMs.

Connecting the Dots: Benchmarking Reflective Memory in Long-Horizon Dialogue

The paper introduces RefMem-Bench, a new benchmark for measuring reflective memory in long-horizon dialogue, and proposes REMIND, a framework that significantly improves models' ability to synthesize fragmented cues into high-level interpretations.

VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization

The paper proposes using Vision-Language Models (VLMs) as 'teachers' to guide Video Generation Models (VGMs) during test-time optimization, significantly improving video reasoning capabilities.

Explainable Forensics of Manipulated Segments in Untrimmed Long Videos

This paper addresses the challenge of detecting and explaining AI-manipulated segments within long, untrimmed videos by proposing a new benchmark and a coarse-to-fine forensic detection framework.

Highlighted terms show continued research focus across papers

Papers

cs.CVRecentJun 1, 2026

VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization

Junhao Cheng, Liang Hou, Tianxiong Zhong, Xin Tao +3 more

The paper proposes using Vision-Language Models (VLMs) as 'teachers' to guide Video Generation Models (VGMs) during test-time optimization, significantly improving video reasoning capabilities.

View →
cs.CVRecentJun 1, 2026

Explainable Forensics of Manipulated Segments in Untrimmed Long Videos

Yue Feng, Jingjing Li, Qijia Lu, Wei Ji +8 more

This paper addresses the challenge of detecting and explaining AI-manipulated segments within long, untrimmed videos by proposing a new benchmark and a coarse-to-fine forensic detection framework.

View →
cs.CLcs.AIRecentMay 31, 2026

Connecting the Dots: Benchmarking Reflective Memory in Long-Horizon Dialogue

Jingjie Lin, Bingbing Wang, Zihan Wang, Zhengda Jin +3 more

The paper introduces RefMem-Bench, a new benchmark for measuring reflective memory in long-horizon dialogue, and proposes REMIND, a framework that significantly improves models' ability to synthesize…

View →
cs.CLcs.AIRecentMay 30, 2026

WaveFilter: Enhancing the Long-Context Capability of Diffusion LLMs via Wavelet-Guided KV Cache Filtering

Jinnan Yang, Yan Wang, Zhen Bi, Kehao Wu +4 more

WaveFilter is a novel, training-free framework that uses wavelet transforms to efficiently filter critical tokens in the KV cache, significantly improving the long-context performance of Diffusion LLM…

View →
cs.CLcs.CVRecentMay 29, 2026

A Visually Impaired Assistance Benchmark for VLM-as-a-Judge Evaluation

Yi Zhao, Siqi Wang, Zhe Hu, Yushi Li +1 more

The paper introduces VIABLE, the first benchmark for evaluating Vision-Language Models (VLMs) as judges for Visually Impaired Assistance (VIA), finding that current models are largely unreliable and p…

View →
cs.CLcs.LGRecentMay 29, 2026

Pairwise Reference Alignment as a Model-Level Ordinal Observable

Mujing Li

The paper provides a formal statistical and conceptual framework for defining and measuring 'pairwise reference alignment,' which quantifies how well a model's scoring function agrees with a given ref…

View →
cs.ROcs.AIcs.CLRecentMay 28, 2026

Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments

Qiuyue Wang, Mingsheng Li, Jian Guan, Jinhui Ye +36 more

Qwen-VLA introduces a unified embodied foundation model that extends vision-language understanding to continuous action generation, enabling robust, multi-task generalization across diverse robotic ta…

View →
cs.AIRecentMay 28, 2026

Cookie-Bench: Continuous On-screen Key Interaction Evaluation for Web Generation

Haoyue Yang, Zhangxiao Shen, Fan Ding, Hangting Lou +7 more

The paper introduces Cookie-Bench, a novel, autonomous, and reference-free evaluation framework that significantly improves the assessment of interactive web generation capabilities for frontier LLMs.

View →
cs.AIRecentMay 28, 2026

Harmonizing Real-Time Constraints and Long-Horizon Reasoning: An Asynchronous Agentic Framework for Dynamic Scheduling

Shijie Cao, Yuan Yuan, Jing Liu

RACE-Sched is an asynchronous agentic framework that successfully integrates low-latency, real-time scheduling decisions with advanced, long-horizon reasoning provided by Large Language Models.

View →
cs.AIq-fin.TRRecentMay 27, 2026

From Knowing to Doing: A Memory-Controlled Benchmark for LLM Trading Agents on Stock Markets

Taojie Zhu, Wentao Zhao, Rui Sun, Beidi Luan +6 more

The paper introduces KTD-Fin, a novel benchmark that evaluates LLM trading agents by masking historical market data and decomposing returns, finding that LLM agents' profits are largely due to passive…

View →
cs.CVcs.AIRecentMay 27, 2026

SmartDirector: Keyframe-Conditioned Cinematic Video Generation with Narrative Pacing Control

Zhida Zhang, Jie Ma, Zhan Peng, Haoxue Wu +4 more

SmartDirector is a novel framework that significantly improves cinematic video generation by using multiple keyframes to provide precise control over narrative structure and temporal pacing.

View →
cs.CRcs.CLRecentMay 26, 2026

BAIT: Boundary-Guided Disclosure Escalation via Self-Conditioned Reasoning

Xuan Luo, Yue Wang, Geng Tu, Jing Li +1 more

The paper introduces BAIT, a three-step jailbreak framework that systematically forces large language models to disclose harmful information by leveraging their internal reasoning and consistency tend…

View →
cs.CRcs.AIRecentApr 10, 2026

Backdoors in RLVR: Jailbreak Backdoors in LLMs From Verifiable Reward

Weiyang Guo, Zesheng Shi, Zeen Zhu, Yuan Zhou +2 more

This paper introduces a novel backdoor attack (ACB) against Reinforcement Learning with Verifiable Rewards (RLVR), demonstrating that poisoning the training data can implant a backdoor that significan…

View →