Tenghao Huang
2 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper introduces FoodGuardBench, a comprehensive benchmark and a specialized guardrail model (FoodGuard-4B) to rigorously test and mitigate the severe food safety risks posed by large language models.
The paper introduces GTA, a scalable framework for generating realistic, multi-hop web-agent tasks with dense, executable trajectories, addressing the current lack of process-level supervision in web agent research.
Papers
GTA: Generating Long-Horizon Tasks for Web Agents at Scale
The paper introduces GTA, a scalable framework for generating realistic, multi-hop web-agent tasks with dense, executable trajectories, addressing the current lack of process-level supervision in web…