Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:
ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Home/Authors/Jieping Ye

Jieping Ye

1 indexed paper

Recent (6 mo)
1
With code
0
Influential cites
0
Benchmarked
0

Publications per year

1
26

Top categories

ML×1AI×1

Frequent co-authors

Zihang Li1×
Rui Zhou1×
Yingcheng Shi1×
Wenhan Yu1×
Zhewen Tan1×
Zixiang Liu1×

Research Timeline

2026
ESPO: Early-Stopping Proximal Policy Optimization

ESPO is a novel reinforcement learning algorithm that detects trajectory failure in large language models and terminates rollouts early, significantly improving performance on mathematical reasoning benchmarks while reducing computational cost.

Highlighted terms show continued research focus across papers

Papers

cs.LGcs.AIRecentMay 28, 2026

ESPO: Early-Stopping Proximal Policy Optimization

Zihang Li, Rui Zhou, Yingcheng Shi, Wenhan Yu +7 more

ESPO is a novel reinforcement learning algorithm that detects trajectory failure in large language models and terminates rollouts early, significantly improving performance on mathematical reasoning b…

View →