Heng Yang

7 indexed papers

Top categories

AI ×5, Vision ×2, NLP ×2, Robotics ×1, Audio and Speech Processing ×1, Software Eng. ×1

Frequent co-authors

Cheng Yang 2×

Rachel Luo 1×

Michael Watson 1×

Apoorva Sharma 1×

Han Qi 1×

Edward Schmerling 1×

Research Timeline

2026

Selective QA over Conflicting Multi-Source Personal Memory: A Diagnostic Testbed and Method Comparison

The paper introduces a diagnostic benchmark for selective question answering over conflicting, multi-source personal memory, and shows that specialized fusion resolvers outperform general LLMs, especially when they are able to abstain.

Beyond Trajectory Rewards: Step-level Credit Assignment for Agentic Search via Graph Modeling

The paper introduces Graph-Distance Contribution Reward (GDCR) and Step Advantage Policy Optimization (SAPO) to provide fine-grained, step-level credit assignment for agentic search by modeling world knowledge as a latent graph.

ParaTool: Shifting Tool Representations from Context to Parameters

ParaTool introduces a novel framework that shifts tool representations from bulky context documentation to dedicated, loadable parameters, enabling efficient and robust tool calling in LLMs.

Do Text Edits Generalize to Visual Generation? Benchmarking Cross-Modal Knowledge Editing in UMMs

The paper introduces UniKE, a benchmark showing that knowledge edits that succeed on the text side of UMMs do not reliably transfer to image generation, revealing a significant modality gap.

PolySpeech-100: A Large-Scale Benchmark for Speech Understanding Across 100+ Languages and Dialects

PolySpeech-100 is a large-scale multilingual benchmark covering 110 linguistic variants for testing Speech-LLMs. It shows that open-source models struggle with low-resource languages and that direct audio processing outperforms cascaded ASR+LLM systems.

STaR-KV: Spatio-Temporal Adaptive Re-weighting for KV Cache Compression in GUI Vision-Language Models

STaR-KV introduces a novel, training-free KV cache compression framework that adaptively re-weights token importance across spatial, temporal, and distributional axes, significantly reducing GPU memory usage for GUI vision-language models while maintaining high accuracy.

X4Val: Learning Neural Surrogates for Variance-Reduced Policy Evaluation

This paper introduces X4Val, a framework for variance-reduced real-world metric estimation using non-paired, multi-domain data.

Papers

cs.RO · Recent · Jun 3, 2026

X4Val: Learning Neural Surrogates for Variance-Reduced Policy Evaluation

Rachel Luo, Michael Watson, Apoorva Sharma, Heng Yang +5 more

This paper introduces X4Val, a framework for variance-reduced real-world metric estimation using non-paired, multi-domain data.

cs.CV · cs.AI · Recent · Jun 1, 2026