Haoqing Wang

1 indexed paper

Top categories

ML ×1

NLP ×1

Frequent co-authors

Xingrun Xing (1×)

Boyan Gao (1×)

Ziheng Li (1×)

Yehui Tang (1×)

Research Timeline

2026

Trust Region On-Policy Distillation

The paper introduces Trust Region On-Policy Distillation (TrOPD), a robust method that stabilizes the on-policy distillation of large language models by restricting training to regions where teacher supervision is reliable.

Papers

cs.LG, cs.CL | Recent | May 31, 2026

Trust Region On-Policy Distillation

Xingrun Xing, Haoqing Wang, Boyan Gao, Ziheng Li +1 more
