Haoming Xu
2 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper introduces LongDS, a new benchmark for long-horizon, multi-turn data analysis, demonstrating that current AI agents struggle significantly with maintaining and updating complex analytical states over extended interactions.
The paper introduces Contextual Belief Management (CBM) to address how LLMs should manage accumulating information over long interactions, showing that reinforcement learning significantly improves belief state accuracy.
Papers
LongDS-Bench: On the Failure of Long-Horizon Agentic Data Analysis
Kewei Xu, Xiaoben Lu, Shuofei Qiao, Zihan Ding +3 more
The paper introduces LongDS, a new benchmark for long-horizon, multi-turn data analysis, demonstrating that current AI agents struggle significantly with maintaining and updating complex analytical st…