OPD+: Rethinking the Advantage Design for On-Policy Distillation