Hongliang He
1 indexed paper
Recent (6 mo): 1
With code: 0
Influential cites: 0
Benchmarked: 0
Publications per year: 1 (2026)
Top categories
NLP×1
Research Timeline
2026
Cost-Aware Diffusion Draft Trees for Speculative Decoding
The paper introduces CaDDTree, a cost-aware method for speculative decoding that jointly selects the draft tree structure and node budget to maximize token throughput. It is reported to outperform existing methods such as DDTree.