Shidong Yang
1 indexed paper
Recent (6 mo): 1
With code: 0
Influential cites: 0
Benchmarked: 0
Publications per year: 1 (2026)
Top categories
ML ×1, AI ×1
Research Timeline
2026
APPO: Agentic Procedural Policy Optimization
This paper proposes Agentic Procedural Policy Optimization (APPO), a method for agentic reinforcement learning that improves tool-use capabilities by assigning credit to fine-grained decision points.
Papers
cs.LG, cs.AI, Empirical, Recent, Jun 10, 2026
APPO: Agentic Procedural Policy Optimization
Xucong Wang, Ziyu Ma, Yong Wang, Yuxiang Ji +4 more