Hao Cheng

5 indexed papers

Top categories

AI ×4 · Vision ×3 · Crypto ×2 · ML ×1 · NLP ×1

Frequent co-authors

Changtao Miao (2×)

Tianle Song (2×)

Yin Wu (2×)

He Liu (2×)

Erjia Xiao (2×)

Junchi Chen (2×)

Research Timeline

2026

Simple Token-Efficient Vision-Language Model for Case-level Pathology Synoptic Report Generation

The paper introduces a simple, token-efficient vision-language model for generating comprehensive pathology synoptic reports from multiple whole-slide images (WSIs), achieving high performance while significantly reducing computational requirements.

VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization

The paper proposes using Vision-Language Models (VLMs) as 'teachers' to guide Video Generation Models (VGMs) during test-time optimization, significantly improving video reasoning capabilities.

SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents

SeClaw is a new framework that synthesizes security tasks from structured risk specifications to evaluate autonomous LLM agents' behavior in stateful environments, focusing on the process of unsafe actions rather than just the final outcome.

OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents

The paper introduces OpenWebRL, an open framework that enables training visual web agents using online multi-turn Reinforcement Learning directly on live websites, achieving state-of-the-art performance on challenging web benchmarks.

SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents

SeClaw is a new framework that uses specification-driven task synthesis to create comprehensive and controllable security benchmarks for evaluating the unsafe behaviors of autonomous LLM agents.

Papers

cs.CV · Recent · Jun 1, 2026

VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization

Junhao Cheng, Liang Hou, Tianxiong Zhong, Xin Tao +3 more

The paper proposes using Vision-Language Models (VLMs) as 'teachers' to guide Video Generation Models (VGMs) during test-time optimization, significantly improving video reasoning capabilities.

cs.CR · cs.AI · Recent · Jun 1, 2026