Kaidi Xu

2 indexed papers

Recent (6 mo)

With code

Influential cites

Benchmarked

Publications per year

Top categories

Crypto×2AI×2

Frequent co-authors

Hao Cheng2×

Changtao Miao2×

Tianle Song2×

Yin Wu2×

He Liu2×

Erjia Xiao2×

Research Timeline

2026

SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents

SeClaw is a new framework that synthesizes security tasks from structured risk specifications to evaluate autonomous LLM agents' behavior in stateful environments, focusing on the process of unsafe actions rather than just the final outcome.

SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents

SeClaw is a new framework that uses specification-driven task synthesis to create comprehensive and controllable security benchmarks for evaluating the unsafe behaviors of autonomous LLM agents.

Highlighted terms show continued research focus across papers

Papers

cs.CRcs.AIRecentJun 1, 2026

SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents

Hao Cheng, Changtao Miao, Tianle Song, Yin Wu +20 more

View →

cs.CRcs.AIRecentJun 1, 2026