Xin Yu

2 indexed papers

Top categories

NLP ×1, Info Retrieval ×1, Crypto ×1, AI ×1

Frequent co-authors

Han Zhang 1×

Zihao Tang 1×

Xiao Liu 1×

Yeyun Gong 1×

Haizhen Huang 1×

Yan Lu 1×

Research Timeline

2026

Do Coding Agents Understand Least-Privilege Authorization?

The paper introduces a new benchmark and a decomposition method, Sufficiency-Tightness Decomposition. It shows that current coding agents struggle to accurately infer least-privilege authorization, and that the decomposition significantly improves both security and task success.

Beyond Static Dialogues: Benchmarking Realistic, Heterogeneous, and Evolving Long-Term Memory

The paper introduces RHELM, a new benchmark designed to test LLMs' long-term memory by simulating realistic, complex, and evolving dialogues that integrate multiple heterogeneous data sources.

Papers

cs.CL · cs.IR · Recent · May 29, 2026

Beyond Static Dialogues: Benchmarking Realistic, Heterogeneous, and Evolving Long-Term Memory

Han Zhang, Zihao Tang, Xin Yu, Xiao Liu +7 more

cs.CR · cs.AI · Recent · May 14, 2026