Xing Shi

2 indexed papers

Recent (6 mo)

With code

Influential cites

Benchmarked

Publications per year

Top categories

AI×2NLP×1

Frequent co-authors

Liuji Chen1×

Dianxing Tang1×

Dingshuo Chen1×

Qiang Liu1×

Shu Wu1×

Liang Wang1×

Research Timeline

2026

Same Evidence, Different Answers: Canonical-Context On-Policy Distillation for Multi-Turn Language Models

The paper introduces Canonical-Context On-Policy Distillation (CCOPD) to improve multi-turn language model performance by mitigating 'self-anchored drift,' ensuring consistent answers regardless of whether the evidence is presented in a single prompt or gradually across multiple turns.

Learning When Not to Act: Mitigating Tool Abuse in Agentic Reinforcement Learning

The paper proposes EAPO, a framework that enables agentic models to learn when to forgo using external tools, thereby mitigating tool abuse while maintaining high reasoning accuracy.

Highlighted terms show continued research focus across papers

Papers

cs.AIRecentJun 1, 2026

Learning When Not to Act: Mitigating Tool Abuse in Agentic Reinforcement Learning

Liuji Chen, Dianxing Tang, Xing Shi, Dingshuo Chen +3 more

The paper proposes EAPO, a framework that enables agentic models to learn when to forgo using external tools, thereby mitigating tool abuse while maintaining high reasoning accuracy.

View →

cs.CLcs.AIRecentMay 28, 2026