Wenhao Liu
1 indexed paper
Recent (6 mo): 1
With code: 0
Influential cites: 0
Benchmarked: 0
Publications per year: 1 (2026)
Top categories
AI × 1
Frequent co-authors
Research Timeline
2026
ReasonAlloc: Hierarchical Decoding-Time KV Cache Budget Allocation for Reasoning Models
This paper proposes ReasonAlloc, a training-free framework that mitigates inference bottlenecks in large language models by recasting decoding-time key-value (KV) cache compression as a hierarchical budget allocation problem.
Papers
cs.AI | Empirical | Recent | Jun 9, 2026
ReasonAlloc: Hierarchical Decoding-Time KV Cache Budget Allocation for Reasoning Models
Wenhao Liu, Hao Shi, Yunhe Li, Weizhi Fei +6 more