Wenhao Wang
3 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
This paper corrects the theoretical analysis of DP-SGD by identifying that common implementations, which use batch averaging, result in weaker privacy guarantees than previously reported.
The paper introduces I-WebGenBench, a framework and benchmark that converts static scientific papers into executable, interactive web systems, allowing users to dynamically explore the paper's mechanisms.
The paper introduces MCP-Persona, a novel benchmark designed to evaluate LLM agents' performance on real-world, personalized applications using the Model Context Protocol (MCP), revealing that current state-of-the-art agents struggle with such personalized tool use.
Papers
MCP-Persona: Benchmarking LLM Agents on Real-World Personal Applications via Environment Simulation
Wenhao Wang, Peizhi Niu, Gongyi Zou, Xiyuan Yang +8 more
The paper introduces MCP-Persona, a novel benchmark designed to evaluate LLM agents' performance on real-world, personalized applications using the Model Context Protocol (MCP), revealing that current…