Jiang Li
7 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper introduces ACIArena, a unified and comprehensive evaluation framework designed to systematically test the robustness of Multi-Agent Systems against complex Agent Cascading Injection attacks.
The paper proposes a novel method to inject reliable, sustained backdoors into LLMs by compiling an activation steering vector into model weights, ensuring the backdoor only activates upon a specific trigger.
The paper proposes a unified geodesic framework that combines tangent-constrained priors with curvature regularization to improve the robustness of image segmentation, especially for complex shapes.
The paper introduces DiffErase, a black-box attack that effectively removes inaudible audio watermarks while preserving perceptual quality by utilizing diffusion models.
The paper proposes Predictive Routing Replay (PR2) to stabilize reinforcement learning on Mixture of Experts (MoE) LLMs by predicting and incorporating short-horizon router evolution during training and rollout.
The paper introduces SURE, a unified framework designed to standardize and improve the comparability and reproducibility of evaluations for advanced speech understanding models.
The paper introduces Lookahead Group Reward (&) to combat Supervision Fidelity Decay (SFD) in on-policy distillation, significantly improving student model performance on long reasoning tasks.
Papers
PR2: Predictive Routing Replay for MoE-Based LLM Reinforcement Learning
Daize Dong, Junlin Chen, Haolong Jia, Jiawei Wu +8 more
The paper proposes Predictive Routing Replay (PR2) to stabilize reinforcement learning on Mixture of Experts (MoE) LLMs by predicting and incorporating short-horizon router evolution during training a…