Ruqi Zhang
1 indexed paper
Recent (6 mo): 1
With code: 0
Influential cites: 0
Benchmarked: 0
Publications per year: 1 (2026)
Top categories
Software Eng. ×1, Crypto ×1
Research Timeline
2026
Who Tests the Testers? Systematic Enumeration and Coverage Audit of LLM Agent Tool Call Safety
The paper introduces SafeAudit, a meta-audit framework that systematically enumerates test cases and uses a quantitative metric to uncover significant residual unsafe behaviors in LLM agents that existing benchmarks miss.