Yunpu Ma
2 indexed papers
Research Timeline
EchoRL proposes a lightweight module to exploit valuable learning signals from advantage-degenerated rollouts in Reinforcement Learning with Verifiable Rewards (RLVR), significantly improving LLM post-training performance.
ProactiveLLM introduces a novel framework that enables streaming LLMs to actively decide when to interact with incoming data by leveraging the model's internal states, significantly reducing latency while maintaining quality.
Papers
ProactiveLLM: Learning Active Interaction for Streaming Large Language Models
Junlong Tong, Yao Zhang, Anhao Zhao, Yingqi Fan +2 more