ArXivCSExplorer

Jiaxi Wen

1 indexed paper

Recent (6 mo): 1
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 1

Top categories

ML ×1, AI ×1, Software Eng. ×1

Frequent co-authors

Yongxi Zhou (1×)
Lai Yun Choi (1×)
Wenbo Ye (1×)

Research Timeline

2026
Accuracy, Stability, and Repeated-Run Reliability of Large Language Models on Deterministic Programming Tasks

The paper demonstrates that standard LLM evaluation metrics overestimate performance because they fail to account for the stability of outcomes, showing a significant gap between reported pass rates and actual retry-free coverage.


Papers

cs.LG · cs.AI · cs.SE · Recent · May 30, 2026

Accuracy, Stability, and Repeated-Run Reliability of Large Language Models on Deterministic Programming Tasks

Yongxi Zhou, Lai Yun Choi, Jiaxi Wen, Wenbo Ye

The paper demonstrates that standard LLM evaluation metrics overestimate performance because they fail to account for the stability of outcomes, showing a significant gap between reported pass rates a…
