Joaquin Vanschoren
2 indexed papers
Research Timeline
The paper introduces Croissant Tasks, a declarative metadata format designed to achieve conceptual reproducibility in machine learning by abstracting problem specifications from brittle implementation details.
The paper introduces TimeSage-MT, a comprehensive multi-turn benchmark designed to rigorously test an LLM agent's ability to perform complex, evolving time series analysis, revealing critical gaps in current agentic reasoning.
Papers
TimeSage-MT: A Multi-Turn Benchmark for Evaluating Agentic Time Series Reasoning
Yaxuan Kong, Qingren Yao, Yuqi Nie, Yichen Li +6 more