Guy Van den Broeck
2 indexed papers
Research Timeline
The paper introduces ProbMoE, a probabilistic routing framework that tackles the non-differentiability of top-$k$ routing in Mixture-of-Experts (MoE) models, achieving strong performance with improved expert utilization.
The paper proposes a novel probabilistic globally constrained decoding (P-GCD) method that efficiently constructs proposals for locally constrained decoding, significantly improving convergence speed and performance compared to existing approaches.
Papers
ProbMoE: Differentiable Probabilistic Routing for Mixture-of-Experts