Xiao Wang

4 indexed papers

Top categories

AI ×3, Crypto ×2, Vision ×1, NLP ×1, Info Retrieval ×1, ML ×1

Frequent co-authors

HuiMing Fan (1×)

Zheng Chu (1×)

Qianyu Wang (1×)

Zhuoyao Wang (1×)

Ming Liu (1×)

Bing Qin (1×)

Research Timeline

2026

MirageBackdoor: A Stealthy Attack that Induces Think-Well-Answer-Wrong Reasoning

MirageBackdoor introduces a novel, highly stealthy backdoor attack that forces Large Language Models to generate correct reasoning steps (Think Well) but output an incorrect final answer (Answer Wrong), bypassing existing detection methods.

Nautilus Compass: Black-box Persona Drift Detection for Production LLM Agents

Nautilus Compass is a novel, black-box agent memory layer that detects persona drift in production LLM coding agents by embedding and comparing raw conversation text, achieving strong performance without requiring model weights or calling an LLM at inference time.

LiveBrowseComp: Are Search Agents Searching, or Just Verifying What They Already Know?

The paper argues that current search agents often verify existing knowledge rather than genuinely searching, and introduces LiveBrowseComp, a new benchmark to measure true evidence-driven discovery.

Mining Multi-Modality Spatio-Temporal Cues for Video Important Person Identification

The paper introduces VIP-Net, a framework that leverages multi-modal spatio-temporal cues and a new dataset (Temporal-VIP) to accurately identify the most influential people in videos, overcoming the challenge of Temporal Importance Shift (TIS).

Papers

cs.AI · Recent · May 27, 2026

LiveBrowseComp: Are Search Agents Searching, or Just Verifying What They Already Know?

HuiMing Fan, Xiao Wang, Zheng Chu, Qianyu Wang +4 more

The paper argues that current search agents often verify existing knowledge rather than genuinely searching, and introduces LiveBrowseComp, a new benchmark to measure true evidence-driven discovery.

cs.CV, cs.AI · Recent · May 27, 2026