ArXivCSExplorer

Maria-Florina Balcan

1 indexed paper

Recent (6 mo): 1
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 1

Top categories

Stats ML ×1, AI ×1, ML ×1

Frequent co-authors

Rattana Pukdee (1×)
Pradeep Ravikumar (1×)

Research Timeline

2026
Reward Learning from Best-of-$N$ Preference Data: Targets, Tradeoffs, and Design Principles

This paper analyzes Best-of-$N$ preference data, deriving explicit reward targets for independent-reference variants and establishing design principles for choosing $N$ and the base distribution to optimize reward learning.


Papers

stat.ML, cs.AI, cs.LG | Recent | May 28, 2026

Reward Learning from Best-of-$N$ Preference Data: Targets, Tradeoffs, and Design Principles

Rattana Pukdee, Maria-Florina Balcan, Pradeep Ravikumar

This paper analyzes Best-of-$N$ preference data, deriving explicit reward targets for independent-reference variants and establishing design principles for choosing $N$ and the base distribution to op…
