Ruilong Ren
1 indexed paper
Recent (6 mo): 1
With code: 0
Influential cites: 0
Benchmarked: 0
Publications per year: 1 (2026)
Top categories
NLP×1
Research Timeline
2026
Speculative Pipeline Decoding: Higher-Accuracy and Zero-Bubble Speculation via Pipeline Parallelism
The paper proposes Speculative Pipeline Decoding (SPD), a novel framework that uses pipeline parallelism to accelerate LLM inference by processing multiple tokens in parallel, achieving higher speedup and zero latency bubbles.
Papers
cs.CL · Recent · May 29, 2026
Speculative Pipeline Decoding: Higher-Accuracy and Zero-Bubble Speculation via Pipeline Parallelism
Yijiong Yu, Huazheng Wang, Shuai Yuan, Ruilong Ren +1 more
The paper proposes Speculative Pipeline Decoding (SPD), a novel framework that uses pipeline parallelism to accelerate LLM inference by processing multiple tokens in parallel, achieving higher speedup…