Zhanxing Zhu

2 indexed papers

Top categories

ML ×1, Crypto ×1, AI ×1

Frequent co-authors

Bochen Lyu (1×)

Yiyang Jia (1×)

Xiaohao Cai (1×)

Yue Cheng (1×)

Jiajun Zhang (1×)

Xiaohui Gao (1×)

Research Timeline

2026

Mechanistically Interpreting the Role of Sample Difficulty in RLVR for LLMs

This paper investigates the non-monotonic role of sample difficulty in Reinforcement Learning with Verifiable Reward (RLVR), finding that medium-difficulty problems provide the most balanced and beneficial learning signals for LLMs.

When Autoregressive Consistency Hurts Safety Alignment

The paper argues that shallow safety alignment in LLMs stems from autoregressive consistency, a mechanism that lets small harmful inputs redirect the model's generation toward unsafe outputs, and concludes that adversarial safety training is therefore necessary.

Papers

cs.LG · cs.CR · Recent · Jun 2, 2026

When Autoregressive Consistency Hurts Safety Alignment

Bochen Lyu, Yiyang Jia, Xiaohao Cai, Zhanxing Zhu

cs.AI · Recent · May 27, 2026