Haozhe Zhang
2 indexed papers
Research Timeline
This study provides the first measurement of authentication security in real-world remote Model Context Protocol (MCP) servers, finding pervasive and critical authentication weaknesses, particularly in dynamic client registration.
The paper proposes Hysteretic Policy Optimization (HPO) and its adaptive variant (A-HPO) to stabilize reinforcement learning training in sparse-reward environments by better balancing positive and negative advantage updates.
Papers
HPO: Hysteretic Policy Optimization for Stable and Efficient Training under Sparse-Reward Regime