ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:

~ similar to 2605.29121· 20 results

cs.LGcs.AIcs.CLEmpiricalRecentJun 10, 2026

Redesign Mixture-of-Experts Routers with Manifold Power Iteration

Songhao Wu, Ang Lv, Ruobing Xie, Yankai Lin

This paper proposes a new router redesign for Mixture-of-Experts models using Manifold Power Iteration to align router rows with the principal singular directions of associated experts.

View →
cs.AIcs.CRRecentMay 22, 2026

Safety-Oriented Routing Analysis of Mixtral MoE Under Benign and Harmful Prompts

Md Nurul Absar Siddiky

The paper analyzes the routing behavior of Mixtral MoE under benign and harmful prompts using activation and gradient signals, finding that safety-relevant routing is subtle, depth-dependent, and dist…

View →
cs.LGcs.AIRecentJun 1, 2026

ProbMoE: Differentiable Probabilistic Routing for Mixture-of-Experts

Heng Zhao, Zilei Shao, Guy Van den Broeck, Zhe Zeng

The paper introduces ProbMoE, a probabilistic routing framework that tackles the non-differentiability of top-$k$ routing in Mixture-of-Experts (MoE) models, achieving strong performance with improved…

View →
cs.CRRecentMay 6, 2026

Misrouter: Exploiting Routing Mechanisms for Input-Only Attacks on Mixture-of-Experts LLMs

Zekun Fei, Zihao Wang, Weijie Liu, Ruiqi He +3 more

Misrouter introduces an input-only adversarial framework to exploit the routing mechanisms of Mixture-of-Experts (MoE) LLMs, enabling unsafe behavior induction against remotely hosted, black-box servi…

View →
cs.LGcs.AIRecentMay 29, 2026

PR2: Predictive Routing Replay for MoE-Based LLM Reinforcement Learning

Daize Dong, Junlin Chen, Haolong Jia, Jiawei Wu +8 more

The paper proposes Predictive Routing Replay (PR2) to stabilize reinforcement learning on Mixture of Experts (MoE) LLMs by predicting and incorporating short-horizon router evolution during training a…

View →
cs.CLcs.AIRecentMay 27, 2026

Routing-Aligned Fine-Tuning for Multilingual Downstream Tasks in Mixture-of-Experts Models

Guanzhi Deng, Kuan Wu, Haibo Wang, Shing Yin Wong +2 more

The paper introduces RA-MoE, a novel fine-tuning framework that leverages the internal routing structure of Mixture-of-Experts (MoE) models to improve performance on multilingual downstream tasks by a…

View →
cs.LGcs.AIRecentMay 28, 2026

Graph-Conditioned Mixture of Graph Neural Network Experts for Traffic Forecasting

Amirhossein Ghaffari, Saeid Sheikhi, Ekaterina Gilman

The paper proposes GC-MoE, a graph-conditioned Mixture of Experts framework, to improve traffic forecasting by assigning personalized, specialized forecasting experts to individual road segments.

View →
cs.CRRecentApr 30, 2026

MASCing: Configurable Mixture-of-Experts Behavior via Activation Steering Masks

Jona te Lintelo, Lichao Wu, Marina Krček, Sengim Karayalçin +1 more

MASCing is a novel framework that enables flexible, non-retraining reconfiguration of Mixture-of-Experts (MoE) models for specific safety objectives by applying activation steering masks to control ex…

View →
cs.AIRecentMay 31, 2026

DAG-MoE: From Simple Mixture to Structural Aggregation in Mixture-of-Experts

Jiarui Feng, Hanqing Zeng, Karish Grover, Ruizhong Qiu +10 more

The paper proposes DAG-MoE, a novel sparse Mixture-of-Experts framework that replaces standard weighted-sum aggregation with structural aggregation to enhance model performance and enable multi-step r…

View →
cs.CRcs.AIcs.CLRecentApr 16, 2026

Route to Rome Attack: Directing LLM Routers to Expensive Models via Adversarial Suffix Optimization

Haochun Tang, Yuliang Yan, Jiahua Lu, Huaxiao Liu +1 more

The paper introduces R$^2$A, an adversarial attack that uses suffix optimization to mislead black-box LLM routers into consistently selecting expensive, high-capability models.

View →
cs.CRRecentMar 20, 2026

Constraint Migration: A Formal Theory of Throughput in AI Cybersecurity Pipelines

Surasak Phetmanee

The paper develops a formal theory to analyze how throughput changes in AI-enhanced cybersecurity pipelines when stage capacities are perturbed by multipliers.

View →
cs.LGcs.AIcs.CLRecentMay 30, 2026

MESA: Improving MoE Safety Alignment via Decentralized Expertise

Yitong Sun, Yao Huang, Teng Li, Ranjie Duan +4 more

MESA is a targeted alignment framework that decentralizes safety responsibilities across multiple experts in Mixture-of-Experts (MoE) LLMs using Optimal Transport theory, thereby improving safety robu…

View →
cs.MAcs.AIcs.CLRecentMay 30, 2026

Dynamic Coordination Strategy Selection for Enterprise Multi-Agent Systems

Thanh Luong Tuan

The paper evaluates dynamic coordination strategy selection for enterprise multi-agent systems, finding that a calibrated default routing approach is effective, even if a deterministic winner-selectio…

View →
cs.CRcs.NImath.NARecentMay 26, 2026

Shortest Path Problem with Subnormal Gaussian Fuzzy Costs

Hande Günay Akdemir, Murat Moran

This paper proposes a reliability-aware framework to solve the fuzzy shortest path problem in directed graphs, optimizing routes based not only on cost but also on the reliability of the associated fu…

View →
cs.LGcs.AIRecentJun 1, 2026

DOT-MoE: Differentiable Optimal Transport for MoEfication

Udbhav Bamba, Arnav Chavan, Aryamaan Thakur, Steve Teig +1 more

DOT-MoE introduces a novel framework that treats the decomposition of dense layers into Mixture of Experts (MoE) as a Differentiable Optimal Transport problem, achieving superior efficiency while pres…

View →
cs.LGcs.ARRecentJun 2, 2026

MOSAIC: Efficient Mixture-of-Agent Scheduling via Adaptive Aggregation and Inference Concurrency

Saptarshi Mitra, Yifan Zhang, Rachid Karami, Phyo Pyae Moe Aung +4 more

MOSAIC is a novel scheduling framework that significantly accelerates Mixture-of-Agents (MoA) workloads by jointly optimizing expert placement and utilizing confidence-aware adaptive aggregation.

View →
cs.LGcs.AIcs.CLRecentMay 29, 2026

OrcaRouter: A Production-Oriented LLM Router with Hybrid Offline-Online Learning

Zhenghua Bao, Fengya Tian, Chris Zhang, Zhenjun Chen +2 more

OrcaRouter is a production-ready LLM router that uses a hybrid offline-online learning approach to efficiently select the best large language model for an incoming query, achieving high accuracy at lo…

View →
cs.LGcs.AIRecentMay 30, 2026

Interpretable Policy Distillation for Power Grid Topology Control

Aleksandra Dmitruka, Karlis Freivalds

This paper demonstrates that a complex deep reinforcement learning policy for power grid control can be successfully distilled into a lightweight, auditable decision tree and random forest surrogate t…

View →
cs.CRcs.ARcs.CLRecentMay 24, 2026

RouteScan: A Non-Intrusive Approach to Auditing MoE LLMs Safety via Expert Routing Telemetry

Bo Lv, Zhiheng Xu, KeDong Xiu, Ruyi Ding +3 more

RouteScan introduces a non-intrusive framework that audits the safety of Mixture-of-Experts (MoE) LLMs by analyzing low-level GPU expert routing telemetry, achieving high accuracy even on unseen harmf…

View →
cs.CRRecentMay 29, 2026

Inferring Routing-Layer Defense Mechanisms from Observable Behavior in OLSR-Based MANETs

Nadav Schweitzer, Kiril Danilchenko, Ariel Stulman

This paper demonstrates that a specific routing-layer defense mechanism in OLSR-based MANETs can be inferred from passively observable routing and control-plane behavior, even when the defense operate…

View →