Lei Sha
4 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper introduces SkillBrew, a multi-objective framework that treats skill bank curation as a constrained optimization problem to build efficient and well-curated skill repositories for LLM agents.
The paper introduces TimeSage-MT, a comprehensive multi-turn benchmark designed to rigorously test an LLM agent's ability to perform complex, evolving time series analysis, revealing critical gaps in current agentic reasoning.
SafeSteer proposes a localized on-policy distillation method that restricts safety alignment to specific safety tokens, thereby achieving strong safety performance with minimal degradation to general capabilities and significantly reducing data requirements.
The paper introduces ProbMoE, a probabilistic routing framework that tackles the non-differentiability of top-$k$ routing in Mixture-of-Experts (MoE) models, achieving strong performance with improved expert utilization.
Papers
SafeSteer: Localized On-Policy Distillation for Efficient Safety Alignment
Hao Li, Jingkun An, Zijun Song, Pengyu Zhu +7 more
SafeSteer proposes a localized on-policy distillation method that restricts safety alignment to specific safety tokens, thereby achieving strong safety performance with minimal degradation to general…