Soeun Kim

1 indexed paper

Top categories

AI ×1, NLP ×1, ML ×1

Frequent co-authors

Albert No (1×)

Research Timeline

2026

Where Rollouts Begin: Low-Load, High-Leverage First-Token Diversification for RLVR

The paper introduces REFT, a method that diversifies rollouts by sampling the first token after the reasoning marker. It reports improved performance in Reinforcement Learning with Verifiable Rewards (RLVR) without altering the core RLVR pipeline.

Papers

cs.AI, cs.CL, cs.LG | Recent | May 27, 2026

Where Rollouts Begin: Low-Load, High-Leverage First-Token Diversification for RLVR

Soeun Kim, Albert No