Ziheng Zhou

2 indexed papers

Top categories

AI ×2, NLP ×2, Software Eng. ×1, ML ×1, Crypto ×1

Frequent co-authors

Yangzhen Wu (1×)

Aaron J. Li (1×)

Wenjie Ma (1×)

Li Cao (1×)

Mert Cemri (1×)

Shu Liu (1×)

Research Timeline

2026

CREBench: Evaluating Large Language Models in Cryptographic Binary Reverse Engineering

The paper introduces CREBench, a comprehensive benchmark for evaluating Large Language Models (LLMs) on cryptographic binary reverse engineering, finding that while LLMs show promise, human experts still maintain a significant advantage.

BenchEvolver: Frontier Task Synthesis via Solution-Centric Evolution

BenchEvolver introduces a solution-centric evolutionary framework to automatically transform saturated coding benchmarks into significantly harder, high-quality, and diverse evaluation suites.

Papers

cs.SE, cs.AI, cs.CL | Recent | May 31, 2026

BenchEvolver: Frontier Task Synthesis via Solution-Centric Evolution

Yangzhen Wu, Aaron J. Li, Wenjie Ma, Li Cao +9 more

BenchEvolver introduces a solution-centric evolutionary framework to automatically transform saturated coding benchmarks into significantly harder, high-quality, and diverse evaluation suites.

cs.CR, cs.AI, cs.CL | Recent | Apr 4, 2026