Ao Cheng

7 indexed papers

Recent (6 mo)

With code

Influential cites

Benchmarked

Publications per year

Top categories

AI×5Vision×4Crypto×3ML×1NLP×1

Frequent co-authors

Hao Cheng3×

Changtao Miao2×

Tianle Song2×

Yin Wu2×

He Liu2×

Erjia Xiao2×

Research Timeline

2026

Demystifying and Detecting Agentic Workflow Injection Vulnerabilities in GitHub Actions

This paper introduces Agentic Workflow Injection (AWI), a new class of vulnerability in LLM-powered GitHub Actions, and presents TaintAWI, a novel taint-analysis tool that identifies hundreds of exploitable zero-day vulnerabilities.

StemBind: When MLLMs Get Lost Between Rules and Instances in Abstract Visual Reasoning

The paper introduces StemBind, a diagnostic benchmark that separates perception, rule induction, and answer selection in abstract visual reasoning, revealing that the primary failure point for MLLMs is the mapping of the identified rule to the correct instance.

Simple Token-Efficient Vision-Language Model for Case-level Pathology Synoptic Report Generation

The paper introduces a simple, token-efficient vision-language model for generating comprehensive pathology synoptic reports from multiple whole-slide images (WSIs), achieving high performance while significantly reducing computational requirements.

VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization

The paper proposes using Vision-Language Models (VLMs) as 'teachers' to guide Video Generation Models (VGMs) during test-time optimization, significantly improving video reasoning capabilities.

SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents

SeClaw is a new framework that synthesizes security tasks from structured risk specifications to evaluate autonomous LLM agents' behavior in stateful environments, focusing on the process of unsafe actions rather than just the final outcome.

OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents

The paper introduces OpenWebRL, an open framework that enables training visual web agents using online multi-turn Reinforcement Learning directly on live websites, achieving state-of-the-art performance on challenging web benchmarks.

SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents

SeClaw is a new framework that uses specification-driven task synthesis to create comprehensive and controllable security benchmarks for evaluating the unsafe behaviors of autonomous LLM agents.

Highlighted terms show continued research focus across papers

Papers

cs.CVRecentJun 1, 2026

VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization

Junhao Cheng, Liang Hou, Tianxiong Zhong, Xin Tao +3 more

The paper proposes using Vision-Language Models (VLMs) as 'teachers' to guide Video Generation Models (VGMs) during test-time optimization, significantly improving video reasoning capabilities.

View →

cs.CRcs.AIRecentJun 1, 2026