Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:
ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Home/Authors/Wei Xi

Wei Xi

15 indexed papers

Recent (6 mo)
15
With code
0
Influential cites
0
Benchmarked
0

Publications per year

15
26

Top categories

AI×11Crypto×10ML×3NLP×2Materials Science×1Databases×1Info Retrieval×1Vision×1

Frequent co-authors

Chaowei Xiao7×
Zhengyue Zhao2×
Xiaogeng Liu2×
Zhongwei Xie2×
Yangqiu Song2×
He Yang2×

Research Timeline

2026
Mind Your HEARTBEAT! Claw Background Execution Inherently Enables Silent Memory Pollution

The paper identifies that background 'heartbeat' execution in personal AI agents like Claw can silently pollute the agent's memory with external misinformation, influencing user behavior without the user's knowledge or explicit prompt injection.

Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses

This survey provides a comprehensive, structured review of safety research in Embodied AI, analyzing attacks and defenses across the entire embodied pipeline to guide the development of safe, robust, and reliable real-world agents.

SNEAKDOOR: Stealthy Backdoor Attacks against Distribution Matching-based Dataset Condensation

Sneakdoor introduces a novel backdoor attack method that enhances stealthiness in dataset condensation by using a generative module to create input-aware triggers, achieving high attack efficacy while remaining imperceptible.

Architecting Secure AI Agents: Perspectives on System-Level Defenses Against Indirect Prompt Injection Attacks

The paper proposes a vision for system-level defenses against indirect prompt injection attacks targeting AI agents, emphasizing structured control and human oversight.

Cooking Up Risks: Benchmarking and Reducing Food Safety Risks in Large Language Models

The paper introduces FoodGuardBench, a comprehensive benchmark and a specialized guardrail model (FoodGuard-4B) to rigorously test and mitigate the severe food safety risks posed by large language models.

FedIDM: Achieving Fast and Stable Convergence in Byzantine Federated Learning through Iterative Distribution Matching

FedIDM introduces a novel federated learning framework that uses iterative distribution matching to achieve fast and stable convergence and maintain high model utility even when facing a large proportion of Byzantine malicious clients.

TwinGate: Stateful Defense against Decompositional Jailbreaks in Untraceable Traffic via Asymmetric Contrastive Learning

TwinGate introduces a stateful dual-encoder defense framework using Asymmetric Contrastive Learning to detect malicious intent from fragmented, untraceable LLM queries with high recall and low false positives.

Personalized w-Event Privacy for Infinite Stream Estimation

This paper introduces personalized mechanisms for estimating streaming statistics under $w$-event personalized differential privacy, significantly improving accuracy compared to existing methods.

LPG: Balancing Efficiency and Policy Reasoning in Latent Policy Guardrails

The paper introduces Latent Policy Guardrail (LPG), a novel framework that efficiently enforces dynamic safety policies for LLMs by compressing complex policy deliberation into a small set of latent tokens, achieving high accuracy with significantly reduced latency.

Mechanistically Interpreting the Role of Sample Difficulty in RLVR for LLMs

This paper investigates the non-monotonic role of sample difficulty in Reinforcement Learning with Verifiable Reward (RLVR), finding that medium-difficulty problems provide the most balanced and beneficial learning signals for LLMs.

What drives performance in molecular MPNNs? An operator-level factorial benchmark

The paper introduces an operator-level factorial benchmark for molecular MPNNs, finding that message construction (specifically concatenation-based mixing) is the primary determinant of performance, rather than the complexity of the node update mechanism.

Meta-Cognitive Memory Policy Optimization for Long-Horizon LLM Agents

The paper introduces Metacognitive Memory Policy Optimization (MMPO), a novel memory training approach that optimizes LLM memory not based on final task success, but on minimizing epistemic uncertainty in intermediate summaries, significantly improving long-horizon agent performance.

PatchWorld: Gradient-Free Optimization of Executable World Models

PatchWorld introduces a gradient-free framework to create executable Python world models from offline trajectories, achieving high planning scores by inducing symbolic belief-state programs.

SkillRevise: Improving LLM-Authored Agent Skills via Trace-Conditioned Skill Revision

SkillRevise is an execution-grounded framework that iteratively refines initial, imperfect LLM agent skills by diagnosing defects from execution evidence and applying empirically validated edits, significantly boosting agent performance.

MaskForge: Structure-Aware Adaptive Attacks for Jailbreaking Diffusion Large Language Models

MaskForge is a novel, adaptive, black-box attack framework that significantly improves jailbreaking diffusion large language models (dLLMs) by treating red-teaming as an optimized search over reusable structural patterns.

Highlighted terms show continued research focus across papers

Papers

cs.CRcs.AIRecentJun 1, 2026

MaskForge: Structure-Aware Adaptive Attacks for Jailbreaking Diffusion Large Language Models

Yingzi Ma, Zhengyue Zhao, Xiaogeng Liu, Minhui Xue +2 more

MaskForge is a novel, adaptive, black-box attack framework that significantly improves jailbreaking diffusion large language models (dLLMs) by treating red-teaming as an optimized search over reusable…

View →
cs.AIRecentMay 31, 2026

SkillRevise: Improving LLM-Authored Agent Skills via Trace-Conditioned Skill Revision

Yuxuan Liu, Zhaochen Su, Lingyun Xie, Yuhao Zhang +10 more

SkillRevise is an execution-grounded framework that iteratively refines initial, imperfect LLM agent skills by diagnosing defects from execution evidence and applying empirically validated edits, sign…

View →
cs.CLcs.AIRecentMay 29, 2026

PatchWorld: Gradient-Free Optimization of Executable World Models

Jiaxin Bai, Yue Guo, Yifei Dong, Jiaxuan Xiong +12 more

PatchWorld introduces a gradient-free framework to create executable Python world models from offline trajectories, achieving high planning scores by inducing symbolic belief-state programs.

View →
cond-mat.mtrl-scics.AIcs.LGRecentMay 28, 2026

What drives performance in molecular MPNNs? An operator-level factorial benchmark

Panyu Jiao, Shuizhou Chen, Yiheng Shen, Yuyang Wang +2 more

The paper introduces an operator-level factorial benchmark for molecular MPNNs, finding that message construction (specifically concatenation-based mixing) is the primary determinant of performance, r…

View →
cs.AIRecentMay 28, 2026

Meta-Cognitive Memory Policy Optimization for Long-Horizon LLM Agents

Ziyan Liu, Zhezheng Hao, Yeqiu Chen, Hong Wang +6 more

The paper introduces Metacognitive Memory Policy Optimization (MMPO), a novel memory training approach that optimizes LLM memory not based on final task success, but on minimizing epistemic uncertaint…

View →
cs.AIRecentMay 27, 2026

Mechanistically Interpreting the Role of Sample Difficulty in RLVR for LLMs

Yue Cheng, Jiajun Zhang, Xiaohui Gao, Weiwei Xing +2 more

This paper investigates the non-monotonic role of sample difficulty in Reinforcement Learning with Verifiable Reward (RLVR), finding that medium-difficulty problems provide the most balanced and benef…

View →
cs.CRcs.AIRecentMay 17, 2026

LPG: Balancing Efficiency and Policy Reasoning in Latent Policy Guardrails

Nanxi Li, Zhengyue Zhao, Chaowei Xiao

The paper introduces Latent Policy Guardrail (LPG), a novel framework that efficiently enforces dynamic safety policies for LLMs by compressing complex policy deliberation into a small set of latent t…

View →
cs.DBcs.CRcs.IRRecentMay 9, 2026

Personalized w-Event Privacy for Infinite Stream Estimation

Leilei Du, Xu Zhou, Peng Cheng, Lei Chen +3 more

This paper introduces personalized mechanisms for estimating streaming statistics under $w$-event personalized differential privacy, significantly improving accuracy compared to existing methods.

View →
cs.CRcs.CLcs.LGRecentApr 30, 2026

TwinGate: Stateful Defense against Decompositional Jailbreaks in Untraceable Traffic via Asymmetric Contrastive Learning

Bowen Sun, Chaozhuo Li, Yaodong Yang, Yiwei Wang +1 more

TwinGate introduces a stateful dual-encoder defense framework using Asymmetric Contrastive Learning to detect malicious intent from fragmented, untraceable LLM queries with high recall and low false p…

View →
cs.LGcs.CRRecentApr 16, 2026

FedIDM: Achieving Fast and Stable Convergence in Byzantine Federated Learning through Iterative Distribution Matching

He Yang, Dongyi Lv, Wei Xi, Song Ma +2 more

FedIDM introduces a novel federated learning framework that uses iterative distribution matching to achieve fast and stable convergence and maintain high model utility even when facing a large proport…

View →
cs.CRRecentApr 1, 2026

Cooking Up Risks: Benchmarking and Reducing Food Safety Risks in Large Language Models

Weidi Luo, Xiaofei Wen, Tenghao Huang, Hongyi Wang +4 more

The paper introduces FoodGuardBench, a comprehensive benchmark and a specialized guardrail model (FoodGuard-4B) to rigorously test and mitigate the severe food safety risks posed by large language mod…

View →
cs.CRcs.AIRecentMar 31, 2026

Architecting Secure AI Agents: Perspectives on System-Level Defenses Against Indirect Prompt Injection Attacks

Chong Xiang, Drew Zagieboylo, Shaona Ghosh, Sanjay Kariyappa +4 more

The paper proposes a vision for system-level defenses against indirect prompt injection attacks targeting AI agents, emphasizing structured control and human oversight.

View →
cs.CRcs.AIRecentMar 29, 2026

SNEAKDOOR: Stealthy Backdoor Attacks against Distribution Matching-based Dataset Condensation

He Yang, Dongyi Lv, Song Ma, Wei Xi +1 more

Sneakdoor introduces a novel backdoor attack method that enhances stealthiness in dataset condensation by using a generative module to create input-aware triggers, achieving high attack efficacy while…

View →
cs.CRcs.AIcs.CVRecentMar 28, 2026

Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses

Xiao Li, Xiang Zheng, Yifeng Gao, Xinyu Xia +34 more

This survey provides a comprehensive, structured review of safety research in Embodied AI, analyzing attacks and defenses across the entire embodied pipeline to guide the development of safe, robust,…

View →
cs.CRcs.AIcs.SIRecentMar 24, 2026

Mind Your HEARTBEAT! Claw Background Execution Inherently Enables Silent Memory Pollution

Yechao Zhang, Shiqian Zhao, Jie Zhang, Gelei Deng +4 more

The paper identifies that background 'heartbeat' execution in personal AI agents like Claw can silently pollute the agent's memory with external misinformation, influencing user behavior without the u…

View →