Mehran Khodabandeh

1 indexed paper

Recent (6 mo)

With code

Influential cites

Benchmarked

Publications per year

Top categories

NLP×1

Frequent co-authors

Anqi Liu1×

Research Timeline

2026

Configurable Reward Model for Balanced Safety Alignment

The paper introduces the Configurable Safety Reward Model (CSRM), a novel reward model that can be jointly optimized for calibrated safety compliance and reward modeling, significantly improving LLM safety alignment across diverse and unseen safety configurations.

Highlighted terms show continued research focus across papers

Papers

cs.CLRecentMay 28, 2026

Configurable Reward Model for Balanced Safety Alignment

Zhengping Jiang, Mehran Khodabandeh, Akash Bharadwaj, Manik Bhandari +4 more

View →