Yilun Yao
2 indexed papers
Research Timeline
Harness-Bench is a diagnostic benchmark that measures how different system harnesses affect LLM agent performance in realistic workflows. Its results show that agent capability must be reported at the level of the model-harness configuration.
ConMoE is a training-free method for compressing Mixture-of-Experts (MoE) models. It consolidates the large expert pool into a smaller set of reusable prototypes and deterministically remaps every original expert call to one of these prototypes.
Papers
ConMoE: Expert-Pool Consolidation via Prototype Reassignment for MoE Compression
Yilun Yao, Jiaming Pan, Elsie Dai, Peizhuang Cong +2 more