Weitong Zhang

2 indexed papers

Top categories

ML ×2, AI ×2

Frequent co-authors

Tianrun Yu (1×)

Fenglong Ma (1×)

Research Timeline

2026

Return-to-Go Is More Than a Number: Q-Guided Alignment for Return-Conditioned Supervised Learning

The paper introduces Q-ALIGN DT, a framework for return-conditioned sequence models that enforces alignment between the input return-to-go (RTG) signal and the expected Q-value of the output policy, with the stated aim of improving policy controllability and performance.

LARK: Learnability-Grounded Trajectory Selection for Efficient Reasoning Distillation

LARK selects reasoning trajectories by learnability, prioritizing those the student model can learn from, with the stated aim of making reasoning distillation more efficient.

Papers

cs.LG, cs.AI · Recent · May 28, 2026

LARK: Learnability-Grounded Trajectory Selection for Efficient Reasoning Distillation

Tianrun Yu, Kaixiang Zhao, Chih-Chun Chen, Amanda Hughes +4 more

cs.LG, cs.AI · Recent · May 27, 2026