Da Zhang

3 indexed papers

Top categories

Vision ×3, AI ×3

Frequent co-authors

Bingyu Li (1×)

Tao Huo (1×)

Zhiyuan Zhao (1×)

Junyu Gao (1×)

Xuelong Li (1×)

Chong Bao (1×)

Research Timeline

2026

SmartDirector: Keyframe-Conditioned Cinematic Video Generation with Narrative Pacing Control

SmartDirector is a framework for cinematic video generation that conditions on multiple keyframes to give precise control over narrative structure and temporal pacing.

Archon: A Unified Multimodal Model for Holistic Digital Human Generation

Archon is a unified, fully pretrained multimodal model for holistic digital human generation that integrates seven modalities (including text, audio, motion, and visual content) into a single autoregressive framework.

An Open-Source Benchmark and Baseline for Multi-temporal Referring Segmentation

The paper introduces Multi-temporal Referring Segmentation (MTRS), a new task requiring models to segment language-described temporal changes, and proposes MTRefSeg-R1, a specialized framework that achieves superior performance on the newly created MTRefSeg-21K benchmark.

Papers

cs.CV, cs.AI · Recent · May 31, 2026

An Open-Source Benchmark and Baseline for Multi-temporal Referring Segmentation

Bingyu Li, Da Zhang, Tao Huo, Zhiyuan Zhao +2 more

cs.CV, cs.AI · Recent · May 28, 2026