ArXivCSExplorer

Hongyu He

2 indexed papers

Recent (6 mo): 2
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 2

Top categories

AI ×2, ML ×2

Frequent co-authors

Zhongyu He (1×)
Yuanfan Li (1×)
Fei Huang (1×)
Tianyu Chen (1×)
Siyuan Chen (1×)
Xingyang Li (1×)

Research Timeline

2026
Quotient DAGs for Off-Policy Evaluation: Forward-Flow Importance Sampling and Exact Slate Propensities

The paper introduces a quotient-DAG view to accurately estimate unordered slate propensities for off-policy evaluation, solving the nuisance variance and computational gap inherent in standard importance sampling for autoregressive recommenders.

SIRI: Self-Internalizing Reinforcement Learning with Intrinsic Skills for LLM Agent Training

SIRI introduces a self-internalizing reinforcement learning framework that allows LLM agents to autonomously discover and integrate reusable skills directly into their core policy, significantly improving performance on complex tasks without external skill generators.


Papers

cs.AI, cs.LG · Recent · Jun 1, 2026

SIRI: Self-Internalizing Reinforcement Learning with Intrinsic Skills for LLM Agent Training

Zhongyu He, Yuanfan Li, Fei Huang, Tianyu Chen +8 more

SIRI introduces a self-internalizing reinforcement learning framework that allows LLM agents to autonomously discover and integrate reusable skills directly into their core policy, significantly impro…

cs.LG, cs.AI · Recent · May 28, 2026

Quotient DAGs for Off-Policy Evaluation: Forward-Flow Importance Sampling and Exact Slate Propensities

Ziwen Xie, Shaowen Xiang, Hongyu He, Dianbo Liu

The paper introduces a quotient-DAG view to accurately estimate unordered slate propensities for off-policy evaluation, solving the nuisance variance and computational gap inherent in standard importa…
