Yaxin Du
2 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
MIRA proposes a novel source-aware filtering framework that discovers and anchors evaluation rubrics during data selection, significantly improving code-oriented mid-training data quality while reducing token usage.
The paper introduces MCP-Persona, a novel benchmark designed to evaluate LLM agents' performance on real-world, personalized applications using the Model Context Protocol (MCP), revealing that current state-of-the-art agents struggle with such personalized tool use.
Papers
MCP-Persona: Benchmarking LLM Agents on Real-World Personal Applications via Environment Simulation
Wenhao Wang, Peizhi Niu, Gongyi Zou, Xiyuan Yang +8 more
The paper introduces MCP-Persona, a novel benchmark designed to evaluate LLM agents' performance on real-world, personalized applications using the Model Context Protocol (MCP), revealing that current…