Hao Peng

6 indexed papers

Top categories

AI ×5, NLP ×3, Info Retrieval ×2, ML ×2, Crypto ×1

Frequent co-authors

Mingdai Yang (1×)

Zhiwei Liu (1×)

Weizhi Zhang (1×)

Yibo Wang (1×)

Philip Yu (1×)

Kaisen Yang (1×)

Research Timeline

2026

SafeHarbor: Hierarchical Memory-Augmented Guardrail for LLM Agent Safety

SafeHarbor is a novel, hierarchical memory-augmented framework that establishes context-aware decision boundaries for LLM agents, achieving state-of-the-art safety while minimizing over-refusal.

Richer Representations for Neural Algorithmic Reasoning via Auxiliary Reconstruction

The paper proposes using an auxiliary reconstruction task, specifically one that captures intra-state feature dependencies, to improve the quality of state representations learned by the encoder in neural algorithmic reasoning.

Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning

This paper introduces CHERRL, a controllable hacking environment for rubric-based reinforcement learning to study and mitigate reward hacking.

OneReason Technical Report

The paper proposes OneReason, a framework that enhances the reasoning capability of generative recommendation models by focusing on improving item perception and structuring user behavior into coherent latent interests.

Scalable Behaviour Cloning on Browser Using via Skill Distillation

This paper proposes a method for creating scalable browser agents by cloning user interaction skills from human browsing data using natural language skills and a skill graph.

Personalized Recommendation Tool Learning via Autonomous Language Agents

A new framework, PRTA, is proposed for full-ranking recommendation tasks using large language models, where an LLM acts as a central planner and traditional recommendation models perform scoring.

Papers

cs.IR, cs.AI, Empirical, Recent, Jul 22, 2026

Personalized Recommendation Tool Learning via Autonomous Language Agents

Mingdai Yang, Zhiwei Liu, Weizhi Zhang, Yibo Wang +2 more

A new framework, PRTA, is proposed for full-ranking recommendation tasks using large language models, where an LLM acts as a central planner and traditional recommendation models perform scoring.

cs.CL, Empirical