Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:
ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Home/Authors/Zicheng Li

Zicheng Li

4 indexed papers

Recent (6 mo)
4
With code
0
Influential cites
0
Benchmarked
0

Publications per year

4
26

Top categories

AI×4ML×2Vision×1

Frequent co-authors

Zicheng Liu2×
Emad Barsoum2×
Shivam Singh1×
Saptarshi Majumdar1×
Pratik Prabhanjan1×
Daize Dong1×

Research Timeline

2026
Better Accuracies, Worse Reasoning: A Step-Level Audit of Medical Chain-of-Thought Distillation

The paper demonstrates that while distilling large language models for medical QA can significantly improve final answer accuracy, this gain often comes at the cost of factual accuracy and detailed reasoning within the generated thought trace.

PR2: Predictive Routing Replay for MoE-Based LLM Reinforcement Learning

The paper proposes Predictive Routing Replay (PR2) to stabilize reinforcement learning on Mixture of Experts (MoE) LLMs by predicting and incorporating short-horizon router evolution during training and rollout.

Smaller Models are Natural Explorers for Policy-Level Diversity in GRPO

The paper proposes S2L-PO, a framework that uses smaller, naturally diverse models as structured explorers to enhance the policy-level diversity and performance of larger language models during training.

Pause and Think: A Dataset and Benchmark for Video-Grounded Assistive Action Suggestion

The paper introduces pause-and-think-T, a reasoning-centric dataset and benchmark that enables compact Vision-Language Models to perform visually grounded, context-aware action suggestion, matching large models like GPT-4o.

Highlighted terms show continued research focus across papers

Papers

cs.CVcs.AIRecentMay 30, 2026

Pause and Think: A Dataset and Benchmark for Video-Grounded Assistive Action Suggestion

Shivam Singh, Saptarshi Majumdar, Pratik Prabhanjan, Zicheng Liu +1 more

The paper introduces pause-and-think-T, a reasoning-centric dataset and benchmark that enables compact Vision-Language Models to perform visually grounded, context-aware action suggestion, matching la…

View →
cs.LGcs.AIRecentMay 29, 2026

PR2: Predictive Routing Replay for MoE-Based LLM Reinforcement Learning

Daize Dong, Junlin Chen, Haolong Jia, Jiawei Wu +8 more

The paper proposes Predictive Routing Replay (PR2) to stabilize reinforcement learning on Mixture of Experts (MoE) LLMs by predicting and incorporating short-horizon router evolution during training a…

View →
cs.LGcs.AIRecentMay 29, 2026

Smaller Models are Natural Explorers for Policy-Level Diversity in GRPO

Yiming Ren, Yiran Xu, Zicheng Lin, Chufan Shi +7 more

The paper proposes S2L-PO, a framework that uses smaller, naturally diverse models as structured explorers to enhance the policy-level diversity and performance of larger language models during traini…

View →
cs.AIRecentMay 27, 2026

Better Accuracies, Worse Reasoning: A Step-Level Audit of Medical Chain-of-Thought Distillation

Zhaoyang Jiang, Xuanqi Peng, Fei Teng, Zhizhong Fu +4 more

The paper demonstrates that while distilling large language models for medical QA can significantly improve final answer accuracy, this gain often comes at the cost of factual accuracy and detailed re…

View →