
Mingze Wu

1 indexed paper

Recent (6 mo): 1
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 1

Top categories

AI (1×)

Frequent co-authors

Abhinav Anand (1×)
Shweta Verma (1×)
Mira Mezini (1×)

Research Timeline

2026
Efficient Post-training of LLMs for Code Generation With Offline Reinforcement Learning

This paper proposes using offline reinforcement learning (RL) as an efficient alternative to online RL for post-training code-generating LLMs, demonstrating its effectiveness, especially for smaller models and complex coding tasks.


Papers

cs.AI | Recent | May 27, 2026

Efficient Post-training of LLMs for Code Generation With Offline Reinforcement Learning

Mingze Wu, Abhinav Anand, Shweta Verma, Mira Mezini

This paper proposes using offline reinforcement learning (RL) as an efficient alternative to online RL for post-training code-generating LLMs, demonstrating its effectiveness, especially for smaller m…
