Dong Huang
1 indexed paper
Recent (6 mo): 1
With code: 0
Influential cites: 0
Benchmarked: 0
Publications per year: 1 (2026)
Top categories
Software Eng. ×1, Crypto ×1
Research Timeline
2026
SABER: Benchmarking Operational Safety of LLM Coding Agents in Stateful Project Workspaces
The paper introduces SABER, a new benchmark that evaluates the operational safety of LLM coding agents in complex, stateful project environments, finding that current models have a high rate of harmful safety violations.
Papers
cs.SE, cs.CR · Recent · May 31, 2026
SABER: Benchmarking Operational Safety of LLM Coding Agents in Stateful Project Workspaces
Qi Hu, Yifeng Tang, Qinghua Wang, Lanyang Zhao +6 more