Junchen Wan

1 indexed paper

Recent (6 mo)

With code

Influential cites

Benchmarked

Publications per year

Top categories

NLP×1AI×1

Frequent co-authors

Yuming1×

Huang1×

Yao Liu1×

Lei Wang1×

Research Timeline

2026

Let the Results Speak: A Replication-First Paradigm for LLM Behavioral Benchmarking

The paper introduces a 'replication-first' paradigm for LLM behavioral benchmarking, demonstrating that this rigorous approach uncovers significant, non-obvious performance drops between successive model versions, such as a notable decline in advice-restraint for GPT-5.

Highlighted terms show continued research focus across papers

Papers

cs.CLcs.AIRecentMay 27, 2026

Let the Results Speak: A Replication-First Paradigm for LLM Behavioral Benchmarking

Yuming, Huang, Yao Liu, Lei Wang +1 more

View →