Zicheng Liu
2 indexed papers
Research Timeline
The paper proposes Predictive Routing Replay (PR2), which stabilizes reinforcement learning on Mixture-of-Experts (MoE) LLMs by predicting short-horizon router evolution and incorporating it during training and rollout.
The paper introduces pause-and-think-T, a reasoning-centric dataset and benchmark that enables compact Vision-Language Models to perform visually grounded, context-aware action suggestion, matching the performance of large models such as GPT-4o.
Papers
Pause and Think: A Dataset and Benchmark for Video-Grounded Assistive Action Suggestion