Juntao Dai

3 indexed papers

Top categories

AI ×3, NLP ×2, Society ×1

Frequent co-authors

Jiaming Ji (3×)

Yaodong Yang (3×)

Tianzhuo Yang (2×)

Yuyan Bu (1×)

Haowei Li (1×)

Qirui Zheng (1×)

Research Timeline

2026

MiraBench: Evaluating Action-Conditioned Reliability in Robotic World Models

The paper introduces MiraBench, a new benchmark that evaluates the action-conditioned reliability of robotic world models, finding that visual fidelity is insufficient and that optimism bias is a pervasive issue across current systems.

SPADE-Bench: Evaluating Spontaneous Strategic Deception in Agents via Plan-Action Divergence

The paper introduces SPADE-Bench, a new benchmark designed to rigorously evaluate 'agent deception'—the divergence between an agent's reported plan and its actual executed actions—which is a critical safety issue for autonomous LLM agents.

SafeMCP: Proactive Power Regulation for LLM Agent Defense via Environment-Grounded Look-Ahead Reasoning

SafeMCP is a server-side defense plugin that uses look-ahead reasoning to proactively filter and constrain tool acquisition for LLM agents, thereby mitigating catastrophic risks associated with expanding action spaces.

Papers

cs.CL, cs.AI | Recent | Jun 1, 2026

SPADE-Bench: Evaluating Spontaneous Strategic Deception in Agents via Plan-Action Divergence

Yuyan Bu, Haowei Li, Qirui Zheng, Bowen Dong +6 more

cs.AI, cs.CL, cs.CY | Recent | Jun 1, 2026