
Heng Yang

7 indexed papers

Recent (6 mo): 7
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 7

Top categories

AI ×5, Vision ×2, NLP ×2, Robotics ×1, Audio and Speech Processing ×1, Software Eng. ×1

Frequent co-authors

Cheng Yang (2×)
Rachel Luo (1×)
Michael Watson (1×)
Apoorva Sharma (1×)
Han Qi (1×)
Edward Schmerling (1×)

Research Timeline

2026
Selective QA over Conflicting Multi-Source Personal Memory: A Diagnostic Testbed and Method Comparison

The paper introduces a diagnostic benchmark for selective question answering over conflicting, multi-source personal memory, demonstrating that specialized fusion resolvers outperform general LLMs, especially when they are able to abstain.

Beyond Trajectory Rewards: Step-level Credit Assignment for Agentic Search via Graph Modeling

The paper introduces Graph-Distance Contribution Reward (GDCR) and Step Advantage Policy Optimization (SAPO) to provide fine-grained, step-level credit assignment for agentic search by modeling world knowledge as a latent graph.

ParaTool: Shifting Tool Representations from Context to Parameters

ParaTool introduces a novel framework that shifts tool representations from bulky context documentation to dedicated, loadable parameters, enabling efficient and robust tool calling in LLMs.

Do Text Edits Generalize to Visual Generation? Benchmarking Cross-Modal Knowledge Editing in UMMs

The paper introduces UniKE, a benchmark showing that knowledge edits that succeed on the text side of UMMs do not reliably transfer to image generation, revealing a significant modality gap.

PolySpeech-100: A Large-Scale Benchmark for Speech Understanding Across 100+ Languages and Dialects

PolySpeech-100 introduces a large multilingual benchmark covering 110 linguistic variants to rigorously test Speech-LLMs, demonstrating that open-source models struggle with low-resource languages and that direct audio processing outperforms cascaded ASR+LLM systems.

STaR-KV: Spatio-Temporal Adaptive Re-weighting for KV Cache Compression in GUI Vision-Language Models

STaR-KV introduces a novel, training-free KV cache compression framework that adaptively re-weights token importance across spatial, temporal, and distributional axes, significantly reducing GPU memory usage for GUI vision-language models while maintaining high accuracy.

X4Val: Learning Neural Surrogates for Variance-Reduced Policy Evaluation

This paper introduces X4Val, a framework for variance-reduced real-world metric estimation using non-paired, multi-domain data.


Papers

cs.RO · Recent · Jun 3, 2026

X4Val: Learning Neural Surrogates for Variance-Reduced Policy Evaluation

Rachel Luo, Michael Watson, Apoorva Sharma, Heng Yang +5 more

This paper introduces X4Val, a framework for variance-reduced real-world metric estimation using non-paired, multi-domain data.

cs.CV, cs.AI · Recent · Jun 1, 2026

STaR-KV: Spatio-Temporal Adaptive Re-weighting for KV Cache Compression in GUI Vision-Language Models

Yuhang Han, Wenzheng Yang, Yujie Chen, Xiangqi Jin +3 more

STaR-KV introduces a novel, training-free KV cache compression framework that adaptively re-weights token importance across spatial, temporal, and distributional axes, significantly reducing GPU memor…

cs.CL, cs.AI, eess.AS · Recent · May 31, 2026

PolySpeech-100: A Large-Scale Benchmark for Speech Understanding Across 100+ Languages and Dialects

Sicheng Yang, Shulan Ruan, Shiwei Wu, Yu Liu +3 more

PolySpeech-100 introduces a massive, multi-lingual benchmark covering 110 linguistic variants to rigorously test Speech-LLMs, demonstrating that open-source models struggle with low-resource languages…

cs.CL, cs.CV · Recent · May 30, 2026

Do Text Edits Generalize to Visual Generation? Benchmarking Cross-Modal Knowledge Editing in UMMs

Xin Gao, Cheng Yang, Chufan Shi, Taylor Berg-Kirkpatrick

The paper introduces UniKE, a benchmark showing that knowledge edits that succeed on the text side of UMMs do not reliably transfer to image generation, revealing a significant modality gap.

cs.AI · Recent · May 28, 2026

Selective QA over Conflicting Multi-Source Personal Memory: A Diagnostic Testbed and Method Comparison

Tiancheng Yang, Matthias Schonlau, Ilia Sucholutsky

The paper introduces a diagnostic benchmark for selective Question Answering over conflicting, multi-source personal memory, demonstrating that specialized fusion resolvers outperform general LLMs, es…

cs.AI · Recent · May 28, 2026

Beyond Trajectory Rewards: Step-level Credit Assignment for Agentic Search via Graph Modeling

Yuchen Liu, Yingjie Feng, Lixiong Qin, Jiasi Chen +4 more

The paper introduces Graph-Distance Contribution Reward (GDCR) and Step Advantage Policy Optimization (SAPO) to provide fine-grained, step-level credit assignment for agentic search by modeling world…

cs.AI, cs.SE · Recent · May 28, 2026

ParaTool: Shifting Tool Representations from Context to Parameters

Zekai Yu, Qi Meng, Qizhi Chu, Yu Hao +2 more

ParaTool introduces a novel framework that shifts tool representations from bulky context documentation to dedicated, loadable parameters, enabling efficient and robust tool calling in LLMs.
