Xingjun Ma

9 indexed papers

Recent (6 mo)

With code

Influential cites

Benchmarked

Publications per year

Top categories

NLP×7Crypto×7AI×5ML×4Vision×4Robotics×1

Frequent co-authors

Yixu Wang5×

Yu-Gang Jiang5×

Xin Wang3×

Yan Teng3×

Yifan Ding3×

Ming Wen3×

Research Timeline

2026

Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses

This survey provides a comprehensive, structured review of safety research in Embodied AI, analyzing attacks and defenses across the entire embodied pipeline to guide the development of safe, robust, and reliable real-world agents.

ML-Bench&Guard: Policy-Grounded Multilingual Safety Benchmark and Guardrail for Large Language Models

The paper introduces ML-Bench, a policy-grounded multilingual safety benchmark, and ML-Guard, a superior guardrail model that enables culturally and legally aligned safety assessment for LLMs across 14 languages.

DarkLLM: Learning Language-Driven Adversarial Attacks with Large Language Models

DarkLLM introduces a novel framework that uses a Large Language Model (LLM) to translate natural language instructions into flexible, latent adversarial attack vectors, demonstrating a systemic vulnerability across diverse foundation models.

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

The paper introduces AgentDoG 1.5, a lightweight and scalable alignment framework that significantly improves AI agent safety and security for complex, open-world agentic scenarios.

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

The paper introduces AgentDoG 1.5, a lightweight and scalable alignment framework that significantly improves AI agent safety and security for complex open-world agent deployments.

MESA: Improving MoE Safety Alignment via Decentralized Expertise

MESA is a targeted alignment framework that decentralizes safety responsibilities across multiple experts in Mixture-of-Experts (MoE) LLMs using Optimal Transport theory, thereby improving safety robustness without sacrificing utility.

BraveGuard: From Open-World Threats to Safer Computer-Use Agents

BraveGuard is a self-evolving defense framework that improves the safety of computer-use agents by training guard models on open-world, multi-step threat trajectories rather than static benchmarks.

BraveGuard: From Open-World Threats to Safer Computer-Use Agents

BraveGuard is a self-evolving defense framework that significantly improves the safety monitoring of computer-use agents by generating guard model supervision from open-world threat discovery and realistic, multi-step execution trajectories.

SentGuard: Sentence-Level Streaming Guardrails for Large Language Models

SentGuard introduces a novel sentence-level streaming guardrail that operates in parallel with LLM generation, achieving high detection rates of unsafe content early in the response while maintaining low false-positive rates.

Highlighted terms show continued research focus across papers

Papers

cs.CLRecentJun 1, 2026

SentGuard: Sentence-Level Streaming Guardrails for Large Language Models

Jiaqi Yu, Xin Wang, Yixu Wang, Jie Li +3 more

View →

cs.CRcs.CLRecentMay 31, 2026