Pengjun Xie
2 indexed papers
Research Timeline
This paper introduces FraudBench, a multimodal benchmark for evaluating the detection of AI-generated fraudulent refund evidence, and finds that current AI models struggle significantly with claim-conditioned fake-damage detection.
This paper introduces CORE-Bench, a comprehensive benchmark for code retrieval in agentic coding.
Papers
CORE-Bench: A Comprehensive Benchmark for Code Retrieval in the Era of Agentic Coding
Fuwei Zhang, Yanzhao Zhang, Mingxin Li, Dingkun Long +4 more