Yuxiao Yang

1 indexed paper

Top categories

ML ×1, AI ×1

Frequent co-authors

Weitong Zhang ×1

Research Timeline

2026

Return-to-Go Is More Than a Number: Q-Guided Alignment for Return-Conditioned Supervised Learning

The paper introduces Q-ALIGN DT, a framework for return-conditioned sequence models that enforces alignment between the input return-to-go (RTG) signal and the expected Q-value of the output policy, with the stated aim of improving policy controllability and performance.

Papers

cs.LG, cs.AI · Recent · May 27, 2026

Return-to-Go Is More Than a Number: Q-Guided Alignment for Return-Conditioned Supervised Learning

Yuxiao Yang, Weitong Zhang
