~ similar to 2605.30486· 20 results
Du Yin, Hao Xue, Arian Prabowo, Shuang Ao +1 more
The paper introduces EvoXXLTraffic, an ultra-large, sensor-evolving dataset that simulates real-world road network growth, demonstrating that existing state-of-the-art traffic forecasting models fail…
The paper introduces ProbMoE, a probabilistic routing framework that tackles the non-differentiability of top-$k$ routing in Mixture-of-Experts (MoE) models, achieving strong performance with improved…
Jiarui Feng, Hanqing Zeng, Karish Grover, Ruizhong Qiu +10 more
The paper proposes DAG-MoE, a novel sparse Mixture-of-Experts framework that replaces standard weighted-sum aggregation with structural aggregation to enhance model performance and enable multi-step r…
TrafficMoE proposes a Disentangle-Filter-Aggregate (DFA) framework using sparse Mixture-of-Experts to improve encrypted traffic classification by separating header and payload features and adaptively…
The paper analyzes the routing behavior of Mixtral MoE under benign and harmful prompts using activation and gradient signals, finding that safety-relevant routing is subtle, depth-dependent, and dist…
Guanzhi Deng, Kuan Wu, Haibo Wang, Shing Yin Wong +2 more
The paper introduces RA-MoE, a novel fine-tuning framework that leverages the internal routing structure of Mixture-of-Experts (MoE) models to improve performance on multilingual downstream tasks by a…
This paper proposes a new router redesign for Mixture-of-Experts models using Manifold Power Iteration to align router rows with the principal singular directions of associated experts.
Udbhav Bamba, Arnav Chavan, Aryamaan Thakur, Steve Teig +1 more
DOT-MoE introduces a novel framework that treats the decomposition of dense layers into Mixture of Experts (MoE) as a Differentiable Optimal Transport problem, achieving superior efficiency while pres…
Zekun Fei, Zihao Wang, Weijie Liu, Ruiqi He +3 more
Misrouter introduces an input-only adversarial framework to exploit the routing mechanisms of Mixture-of-Experts (MoE) LLMs, enabling unsafe behavior induction against remotely hosted, black-box servi…
Bo Lv, Zhiheng Xu, KeDong Xiu, Ruyi Ding +3 more
RouteScan introduces a non-intrusive framework that audits the safety of Mixture-of-Experts (MoE) LLMs by analyzing low-level GPU expert routing telemetry, achieving high accuracy even on unseen harmf…
The paper introduces a framework for composing deep probabilistic models using five specific factor-graph primitives that guarantee closed-form variational inference, thereby preserving tractability i…
EnergyMamba proposes an uncertainty-aware, graph-enhanced selective state space model to significantly improve both the accuracy and reliability of energy consumption prediction by explicitly modeling…
Sicheng Feng, Zigeng Chen, Gongfan Fang, Xinyin Ma +1 more
dMoE proposes a block-level Mixture-of-Experts (MoE) framework for Diffusion Large Language Models (dLLMs) that aggregates token-level expert distributions into a unified block-level distribution, sig…
Daize Dong, Junlin Chen, Haolong Jia, Jiawei Wu +8 more
The paper proposes Predictive Routing Replay (PR2) to stabilize reinforcement learning on Mixture of Experts (MoE) LLMs by predicting and incorporating short-horizon router evolution during training a…
The paper presents an end-to-end system that translates high-level operator intents into low-level, safe routing constraints for LEO mega-constellations, achieving high accuracy and safety guarantees.
Zhiyao Xu, Aoxue Liu, Zhanjie Ding, Dan Zhao +2 more
The paper proposes Task-Aware Coactivation Grouping (TACG) to significantly reduce communication costs in multi-task MoE inference by grouping experts based on task-specific co-activation patterns, ou…
The paper introduces Graph Cascades, a mesoscopic rewiring technique that enhances Graph Neural Networks by promoting node pairs with strong multi-hop connections to direct edges, improving performanc…
The paper proposes CANGuard, a hybrid CNN-GRU-Attention deep learning model, to accurately detect sophisticated Denial-of-Service and spoofing attacks targeting critical in-vehicle CAN bus networks.
The paper introduces a genetic algorithm framework to calibrate complex urban traffic simulations using only sparse real-world traffic observations, eliminating the need for detailed employment data.
The paper introduces a new benchmark (BGTD) and a multimodal framework (mmTraffic) that enables explainable, evidence-grounded interpretation of encrypted network traffic using LLMs.