Weifang Zhang

1 indexed paper

Top categories

AI (1×)

Frequent co-authors

Yuzhou Nie (1×)

Bowen Pang (1×)

Guangrui Ma (1×)

Shining Wu (1×)

Research Timeline

2026

Threshold-Based Exclusive Batching for LLM Inference

This paper proposes a hybrid scheduler that dynamically switches between exclusive batching and mixed batching for LLM inference, achieving superior throughput, especially on bandwidth-constrained GPUs.

Papers

cs.AI · Recent · May 30, 2026

Threshold-Based Exclusive Batching for LLM Inference

Weifang Zhang, Yuzhou Nie, Bowen Pang, Guangrui Ma +1 more
