Built by Teycir Ben Soltane
ArXivCSExplorer

Yizhou Tian

1 indexed paper

Recent (6 mo): 1
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 1

Top categories

ML ×1, AI ×1, NLP ×1

Frequent co-authors

Zizhe Chen (1×)
Jiqian Dong (1×)
Garry Yang (1×)
Yongqiang Chen (1×)
Zhitang Chen (1×)
James Cheng (1×)

Research Timeline

2026
Hista and Numca: Estimate State Value Effectively for LLM Reinforcement Learning

This paper introduces Numca and Hista, two novel techniques that significantly improve state value estimation for LLM reinforcement learning, addressing the instability of standard critic approaches.


Papers

cs.LG · cs.AI · cs.CL · Recent · May 28, 2026

Hista and Numca: Estimate State Value Effectively for LLM Reinforcement Learning

Zizhe Chen, Jiqian Dong, Yizhou Tian, Garry Yang +3 more
