Yizhou Tian

1 indexed paper

Top categories

ML ×1, AI ×1, NLP ×1

Frequent co-authors

Zizhe Chen (1×)

Jiqian Dong (1×)

Garry Yang (1×)

Yongqiang Chen (1×)

Zhitang Chen (1×)

James Cheng (1×)

Research Timeline

2026

Hista and Numca: Estimate State Value Effectively for LLM Reinforcement Learning

This paper introduces Numca and Hista, two novel techniques that significantly improve state value estimation for LLM reinforcement learning, addressing the instability of standard critic approaches.

Papers

cs.LG, cs.AI, cs.CL · Recent · May 28, 2026

Hista and Numca: Estimate State Value Effectively for LLM Reinforcement Learning

Zizhe Chen, Jiqian Dong, Yizhou Tian, Garry Yang +3 more

This paper introduces Numca and Hista, two novel techniques that significantly improve state value estimation for LLM reinforcement learning, addressing the instability of standard critic approaches.
