Ranjie Duan
3 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper introduces AgentDoG 1.5, a lightweight and scalable alignment framework that significantly improves AI agent safety and security for complex, open-world agentic scenarios.
The paper introduces AgentDoG 1.5, a lightweight and scalable alignment framework that significantly improves AI agent safety and security for complex open-world agent deployments.
MESA is a targeted alignment framework that decentralizes safety responsibilities across multiple experts in Mixture-of-Experts (MoE) LLMs using Optimal Transport theory, thereby improving safety robustness without sacrificing utility.
Papers
MESA: Improving MoE Safety Alignment via Decentralized Expertise
Yitong Sun, Yao Huang, Teng Li, Ranjie Duan +4 more
MESA is a targeted alignment framework that decentralizes safety responsibilities across multiple experts in Mixture-of-Experts (MoE) LLMs using Optimal Transport theory, thereby improving safety robu…