Jie Xiao

3 indexed papers

Top categories

AI ×3, Crypto ×2

Frequent co-authors

Xuehai Tang (2×)

Biyu Zhou (2×)

Wenjie Xiao (2×)

Zihao Xue (1×)

Yan Wang (1×)

Zhen Bi (1×)

Research Timeline

2026

RouteGuard: Internal-Signal Detection of Skill Poisoning in LLM Agents

RouteGuard is a novel detector that identifies skill poisoning in LLM agents by monitoring structured internal attention shifts, achieving high detection rates on critical skill-injection attacks.

When the Manual Lies: A Realistic Benchmark to Evaluate MCP Poisoning Attacks for LLM Agents

This paper introduces a new benchmark to test Tool Description Poisoning (TDP) attacks on LLM agents, demonstrating that even advanced models like GPT-4o are highly vulnerable and that current defenses are often ineffective.

Robust and Generalizable Safety Steering for Text-to-Image Diffusion Transformers

The paper proposes SafeDIG, a robust safety steering framework that adapts Diffusion Transformers for text-to-image generation by treating safety control as position-aware sparse feature transfer, ensuring reliable safety across different risk domains.

Papers

cs.AI · Recent · May 28, 2026

Robust and Generalizable Safety Steering for Text-to-Image Diffusion Transformers

Zihao Xue, Yan Wang, Zhen Bi, Long Ma +6 more

cs.CR · cs.AI · Recent · May 22, 2026