ArXivCSExplorer

Da Zhang

3 indexed papers

Recent (6 mo): 3
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 3

Top categories

Vision (×3), AI (×3)

Frequent co-authors

Bingyu Li (1×)
Tao Huo (1×)
Zhiyuan Zhao (1×)
Junyu Gao (1×)
Xuelong Li (1×)
Chong Bao (1×)

Research Timeline

2026
SmartDirector: Keyframe-Conditioned Cinematic Video Generation with Narrative Pacing Control

SmartDirector is a framework for cinematic video generation that conditions on multiple keyframes to control narrative structure and temporal pacing.

Archon: A Unified Multimodal Model for Holistic Digital Human Generation

The paper introduces Archon, a unified, fully pretrained multimodal model that addresses the challenge of generating holistic digital humans by integrating seven modalities (including text, audio, motion, and visual content) into a single autoregressive framework.

An Open-Source Benchmark and Baseline for Multi-temporal Referring Segmentation

The paper introduces Multi-temporal Referring Segmentation (MTRS), a new task requiring models to segment language-described temporal changes, and proposes MTRefSeg-R1, a specialized framework that achieves superior performance on the newly created MTRefSeg-21K benchmark.


Papers

cs.CV, cs.AI | Recent | May 31, 2026

An Open-Source Benchmark and Baseline for Multi-temporal Referring Segmentation

Bingyu Li, Da Zhang, Tao Huo, Zhiyuan Zhao +2 more

The paper introduces Multi-temporal Referring Segmentation (MTRS), a new task requiring models to segment language-described temporal changes, and proposes MTRefSeg-R1, a specialized framework that ac…

cs.CV, cs.AI | Recent | May 28, 2026

Archon: A Unified Multimodal Model for Holistic Digital Human Generation

Chong Bao, Shichen Liu, Lijun Yu, David Futschik +8 more

The paper introduces Archon, a unified, fully pretrained multimodal model that addresses the challenge of generating holistic digital humans by integrating seven modalities (including text, audio, mot…

cs.CV, cs.AI | Recent | May 27, 2026

SmartDirector: Keyframe-Conditioned Cinematic Video Generation with Narrative Pacing Control

Zhida Zhang, Jie Ma, Zhan Peng, Haoxue Wu +4 more

SmartDirector is a framework for cinematic video generation that conditions on multiple keyframes to control narrative structure and temporal pacing.
