Xu Wang

14 indexed papers

Top categories

Crypto ×9, AI ×7, Vision ×4, NLP ×3, Sound ×2, Robotics ×2, Emerging Tech ×2, Audio and Speech Processing ×1

Frequent co-authors

Yixu Wang (5×)

Xingjun Ma (5×)

Yu-Gang Jiang (4×)

Xin Wang (3×)

Yifan Ding (3×)

Ming Wen (3×)

Research Timeline

2026

PlanTwin: Privacy-Preserving Planning Abstractions for Cloud-Assisted LLM Agents

PlanTwin introduces a privacy-preserving architecture that allows cloud-hosted LLMs to plan over sensitive local environments by projecting the raw state into a sanitized, abstract digital twin.

ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers

ClawKeeper is a comprehensive, multi-layered security framework designed to mitigate critical vulnerabilities in autonomous agent runtimes like OpenClaw by enforcing protection across skills, plugins, and system state.

Clawed and Dangerous: Can We Trust Open Agentic Systems?

This paper systematizes the security challenges of open agentic systems, concluding that while attack characterization is mature, the field lacks robust guidelines for operational governance, memory integrity, and capability revocation.

Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses

This survey provides a comprehensive, structured review of safety research in Embodied AI, analyzing attacks and defenses across the entire embodied pipeline to guide the development of safe, robust, and reliable real-world agents.

Defense against Poisoning Attacks under Shuffle-DP

The paper proposes the first general defense framework to make all union-preserving Differential Privacy (DP) protocols, specifically those based on shuffle-DP, resilient against poisoning attacks.

DarkLLM: Learning Language-Driven Adversarial Attacks with Large Language Models

DarkLLM introduces a novel framework that uses a Large Language Model (LLM) to translate natural language instructions into flexible, latent adversarial attack vectors, demonstrating a systemic vulnerability across diverse foundation models.

Knowledge-Intensive Video Generation

The paper introduces Knowledge-Intensive Video Generation (KIVI) as a challenging benchmark for evaluating video models on factuality and practical usefulness, showing that current state-of-the-art systems still struggle to match human performance.

BraveGuard: From Open-World Threats to Safer Computer-Use Agents

BraveGuard is a self-evolving defense framework that improves the safety of computer-use agents by training guard models on open-world, multi-step threat trajectories rather than static benchmarks.

BraveGuard: From Open-World Threats to Safer Computer-Use Agents

BraveGuard is a self-evolving defense framework that significantly improves the safety monitoring of computer-use agents by generating guard model supervision from open-world threat discovery and realistic, multi-step execution trajectories.

SentGuard: Sentence-Level Streaming Guardrails for Large Language Models

SentGuard introduces a novel sentence-level streaming guardrail that operates in parallel with LLM generation, achieving high detection rates of unsafe content early in the response while maintaining low false-positive rates.

PS-MOT: Cultivating Instance Awareness from Point Seeds for Multi-Object Tracking

This paper introduces PS-Track, a hierarchical pipeline for point-supervised Multi-Object Tracking (PS-MOT), which addresses spatial ambiguity and identity drift through Temporal-Feedback Prompting (TFP), Point-Excited Wavelet Attention (PEWA), and Uncertainty-Guided Gaussian Learning (UGL).

Privacy Detective: A Narrative Game that Cultivates Student Developers' Privacy Awareness by Harnessing Legal Documents

The paper introduces Privacy Detective, a game that trains developers on privacy awareness using real-world legal documents.

Pushing the Frontier of Full-Song Generation: Hierarchical Autoregressive Planning Meets Flow-Matching Rendering

This paper introduces a unified framework for generating high-quality, full-length songs from lyrics, text descriptions, and musical attributes. The framework consists of a semantic-aware tokenizer, a hybrid-LM, FullDiT, and a two-level melody module.

TF-MossFormer: Integrating Convolution Gated Local-Global Attentions for Enhanced Time-Frequency Domain Monaural Speech Separation

This paper proposes TF-MossFormer, a time-frequency transformer for monaural speech separation that combines local and global attention using a content-aware sliding-window mechanism.

Papers

cs.SD · Empirical · Recent · Jul 23, 2026

TF-MossFormer: Integrating Convolution Gated Local-Global Attentions for Enhanced Time-Frequency Domain Monaural Speech Separation

Shengkui Zhao, Zexu Pan, Haoxu Wang, Biao Tian +2 more

This paper proposes TF-MossFormer, a time-frequency transformer for monaural speech separation that combines local and global attention using a content-aware sliding-window mechanism.

cs.SD · cs.AI · eess.AS · Empirical