Junlong Tong
2 indexed papers
Research Timeline
ProactiveLLM introduces a novel framework that enables streaming LLMs to actively decide when to interact with incoming data by leveraging the model's internal states, significantly reducing latency while maintaining quality.
This paper proposes CompRank, a token-efficient reranking framework for large language models that reduces redundant computation and achieves strong reranking performance.
Papers
CompRank: Efficient LLM Reranking via Token-Level Compression and Decoding-Free Scoring
Xuan Lu, Haohang Huang, Yingqi Fan, Junlong Tong +4 more