Yan Gao

3 indexed papers

Top categories

NLP ×3, ML ×1

Frequent co-authors

Shihao Rao (1×)

Liang Li (1×)

Jiapeng Liu (1×)

Tong Lin (1×)

Bing Li (1×)

Xiyan Gao (1×)

Research Timeline

2026

Trust Region On-Policy Distillation

The paper introduces Trust Region On-Policy Distillation (TrOPD), a robust method that stabilizes the on-policy distillation of large language models by restricting training to regions where teacher supervision is reliable.

Deep Research as Rubric for Reinforcement Learning

The paper proposes Deep Research as Rubric (DR-rubric), a novel evidence-driven framework that treats rubric construction itself as a research problem to generate fine-grained, scalable reward signals for open-ended reasoning tasks.

What to Format and How: A Benchmark and Workflow Approach for Document Formatting

The paper introduces DocFormBench, a new benchmark for content-aware document formatting, and proposes DocFormFlow, a workflow that improves formatting accuracy and efficiency by decoupling target localization from modification execution.

Papers

cs.CL · Recent · Jun 1, 2026

What to Format and How: A Benchmark and Workflow Approach for Document Formatting

Shihao Rao, Liang Li, Jiapeng Liu, Tong Lin +5 more

cs.LG, cs.CL · Recent · May 31, 2026