Yao Hu

5 indexed papers

Recent (6 mo)

With code

Influential cites

Benchmarked

Publications per year

Top categories

NLP×4ML×2AI×2Crypto×2Software Eng.×1

Frequent co-authors

Yao Huang2×

Wangyi Mei1×

Zhouhong Gu1×

Zhenhan Bai1×

Yin Cai1×

Lefan Zhang1×

Research Timeline

2026

TraceSafe: A Systematic Assessment of LLM Guardrails on Multi-Step Tool-Calling Trajectories

The paper introduces TraceSafe-Bench, a comprehensive benchmark, and finds that securing LLM agents requires jointly optimizing for structural reasoning and safety alignment to mitigate risks during multi-step tool-use.

Label Leakage Attacks in Machine Unlearning: A Parameter and Inversion-Based Approach

This paper analyzes and proposes four novel attack methods—based on model parameters and model inversion—to demonstrate that existing machine unlearning techniques can inadvertently leak the categories of the forgotten data.

Preference-Aware Rubric Learning for Personalized Evaluation

The paper introduces PARL, a framework that learns personalized evaluation rubrics directly from raw user interaction histories to accurately assess how well LLM outputs align with subjective, user-specific preferences.

MESA: Improving MoE Safety Alignment via Decentralized Expertise

MESA is a targeted alignment framework that decentralizes safety responsibilities across multiple experts in Mixture-of-Experts (MoE) LLMs using Optimal Transport theory, thereby improving safety robustness without sacrificing utility.

Deep Research as Rubric for Reinforcement Learning

The paper proposes Deep Research as Rubric (DR-rubric), a novel evidence-driven framework that treats rubric construction itself as a research problem to generate fine-grained, scalable reward signals for open-ended reasoning tasks.

Highlighted terms show continued research focus across papers

Papers

cs.CLRecentMay 31, 2026

Deep Research as Rubric for Reinforcement Learning

Wangyi Mei, Zhouhong Gu, Zhenhan Bai, Yin Cai +8 more

View →

cs.LGcs.AIcs.CLRecentMay 30, 2026