A Predictive Law for On-Policy Self-Distillation From World Feedback