Yang Zhou

4 indexed papers

Top categories

AI ×4, ML ×2, Vision ×1, NLP ×1

Frequent co-authors

Jiakang Li (1×)

Guanyu Zhu (1×)

Can Jin (1×)

Chenxi Huang (1×)

Dexu Yu (1×)

Ronghao Chen (1×)

Research Timeline

2026

OISD: On-Policy Internal Self-Distillation of Language Models

The OISD framework improves language model reasoning by distilling on-policy predictive signals from the final output layer into intermediate representations, yielding substantial improvements on mathematical reasoning tasks.

Structured Prompt Optimization Meets Reinforcement Learning for Global and Local Interpretability over Complex Text

The paper introduces eXTC, a novel framework that combines structured prompt optimization, knowledge distillation, and reinforcement learning to create a highly performant and fully interpretable text classifier.

COMPASS: Cognitive MCTS-Guided Process Alignment for Safe Search Agents

COMPASS introduces a Cognitive MCTS-Guided Process Alignment framework to ensure robust safety for LLM search agents by identifying and supervising risky intermediate steps in multi-step reasoning.

Latent Reward Steering: An Adaptive Inference-Time Framework that Implicitly Promotes Cognitive Behaviors in Reasoning LLMs

The paper introduces Latent Reward Steering (LRS), an adaptive inference-time framework that implicitly improves the reasoning ability of LLMs by guiding the model's internal latent states based on a reward signal derived from final answer correctness.

Papers

cs.AI · Recent · May 30, 2026

Latent Reward Steering: An Adaptive Inference-Time Framework that Implicitly Promotes Cognitive Behaviors in Reasoning LLMs

Jiakang Li, Guanyu Zhu, Can Jin, Chenxi Huang +7 more

cs.AI · Recent · May 29, 2026