Heng Yang
7 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper introduces a diagnostic benchmark for selective Question Answering over conflicting, multi-source personal memory, demonstrating that specialized fusion resolvers outperform general LLMs, especially when incorporating the ability to abstain.
The paper introduces Graph-Distance Contribution Reward (GDCR) and Step Advantage Policy Optimization (SAPO) to provide fine-grained, step-level credit assignment for agentic search by modeling world knowledge as a latent graph.
ParaTool introduces a novel framework that shifts tool representations from bulky context documentation to dedicated, loadable parameters, enabling efficient and robust tool calling in LLMs.
The paper introduces UniKE, a benchmark showing that successful knowledge edits in text-only multimodal models do not reliably transfer to image generation, revealing a significant modality gap.
PolySpeech-100 introduces a massive, multi-lingual benchmark covering 110 linguistic variants to rigorously test Speech-LLMs, demonstrating that open-source models struggle with low-resource languages and that direct audio processing is superior to cascaded ASR+LLM systems.
STaR-KV introduces a novel, training-free KV cache compression framework that adaptively re-weights token importance across spatial, temporal, and distributional axes, significantly reducing GPU memory usage for GUI vision-language models while maintaining high accuracy.
This paper introduces X4Val, a framework for variance-reduced real-world metric estimation using non-paired, multi-domain data.
Papers
X4Val: Learning Neural Surrogates for Variance-Reduced Policy Evaluation
Rachel Luo, Michael Watson, Apoorva Sharma, Heng Yang +5 more
This paper introduces X4Val, a framework for variance-reduced real-world metric estimation using non-paired, multi-domain data.