Ming Cheng

3 indexed papers

Top categories

AI ×3, Sound ×2, Vision ×1, Multimedia ×1

Frequent co-authors

Wanyi Ning (1×)

Wei Zhou (1×)

Yingpeng Li (1×)

Yinshang Guo (1×)

Haitao Qian (1×)

Yiming Cheng (1×)

Research Timeline

2026

MTAVG-Bench 2.0: Diagnosing Failure Modes of Cinematic Expressiveness in Multi-Talker Audio-Video Generation

The paper introduces MTAVG-Bench 2.0, a new benchmark designed to diagnose high-level failure modes of cinematic expressiveness in multi-talker audio-video generation, showing that even advanced models struggle with complex scene-level failures.

DiffCrossGait: Trajectory-Level Alignment for 2D-3D Cross-Modal Gait Recognition via Latent Diffusion

DiffCrossGait proposes a novel trajectory-level alignment method using latent diffusion to overcome domain discrepancies in 2D-3D gait recognition, achieving state-of-the-art performance.

PS4: Proxy-Supervised Joint Training for Real Target Speaker Extraction

The paper introduces PS4, a framework for training target speaker extraction models using a large-scale corpus and proxy-supervised joint training strategy.

Papers

cs.SD · cs.AI · Empirical · Recent · Jul 9, 2026

PS4: Proxy-Supervised Joint Training for Real Target Speaker Extraction

Wanyi Ning, Wei Zhou, Yingpeng Li, Yinshang Guo +2 more

The paper introduces PS4, a framework for training target speaker extraction models using a large-scale corpus and proxy-supervised joint training strategy.

cs.CV · cs.AI · Recent · May 29, 2026