Yao Huang

3 indexed papers

Top categories

ML ×2, AI ×2, NLP ×2, Crypto ×2, Software Eng. ×1

Frequent co-authors

Yitong Sun (1×)

Teng Li (1×)

Ranjie Duan (1×)

Yichi Zhang (1×)

Xingjun Ma (1×)

Hui Xue (1×)

Research Timeline

2026

TraceSafe: A Systematic Assessment of LLM Guardrails on Multi-Step Tool-Calling Trajectories

The paper introduces TraceSafe-Bench, a comprehensive benchmark, and finds that securing LLM agents requires jointly optimizing for structural reasoning and safety alignment to mitigate risks during multi-step tool use.

Label Leakage Attacks in Machine Unlearning: A Parameter and Inversion-Based Approach

This paper proposes and analyzes four novel attack methods, based on model parameters and model inversion, to demonstrate that existing machine unlearning techniques can inadvertently leak the categories of the forgotten data.

MESA: Improving MoE Safety Alignment via Decentralized Expertise

MESA is a targeted alignment framework that decentralizes safety responsibilities across multiple experts in Mixture-of-Experts (MoE) LLMs using Optimal Transport theory, thereby improving safety robustness without sacrificing utility.

Papers

cs.LG · cs.AI · cs.CL · Recent · May 30, 2026

MESA: Improving MoE Safety Alignment via Decentralized Expertise

Yitong Sun, Yao Huang, Teng Li, Ranjie Duan +4 more

cs.CR · cs.AI · cs.CL · Recent · Apr 8, 2026