Zihui Song
2 indexed papers
Research Timeline
The paper introduces Expected Value Alignment (EVA), a novel reward modeling procedure that allows continuous scoring of intermediate reasoning steps in formal mathematics verification while maintaining the discrete, textual output format of generative models.
Soft-NBCE introduces soft entropy-weighted chunk fusion to overcome the semantic fragmentation caused by hard chunk selection in long-context LLMs, significantly improving performance on multi-hop benchmarks.
Papers
Expected Value Alignment for Generative Reward Modeling in Formal Mathematics Verification