Zhi Wang

8 indexed papers

Top categories

AI ×6, NLP ×4, Crypto ×3, Robotics ×2, ML ×2, Multiagent ×1, Software Eng. ×1, Networking ×1

Frequent co-authors

Letian Fu (2×)

Yuke Zhu (2×)

Research Timeline

2026

Large Language Models for Agentic NetOps and AIOps: Architectures, Evaluation, and Safety

The paper surveys the use of LLMs for agentic NetOps and AIOps, arguing that operational reliability depends not on the model itself, but on robust surrounding machinery and workflow-centered evaluation.

Babel: Jailbreaking Safety Attention via Obfuscation Distribution Optimized Sampling

The paper introduces Babel, an efficient black-box attack framework that systematically exploits intrinsic safety gaps in LLMs by optimizing text obfuscation sampling, achieving state-of-the-art jailbreak success rates on commercial models.

Cordon-MAS: Defending RAG against Knowledge Poisoning via Information-Flow Control

The paper introduces CORDON-MAS, a compartmentalized framework that defends Retrieval-Augmented Generation (RAG) against knowledge poisoning by enforcing strict information-flow control, significantly reducing attack success rates.

Which Institutional Frameworks Do Chatbots Assume? Auditing Jurisdictional Defaults in Multilingual LLMs

This study finds that when users do not specify a jurisdiction, the language used in the prompt strongly biases the LLM's response toward a specific national legal framework (U.S. for English, China for Mandarin Chinese), creating a risk of institutional misselection.

Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning

This paper introduces CHERRL, a controllable hacking environment for rubric-based reinforcement learning to study and mitigate reward hacking.

EnterpriseClawBench: Benchmarking Agents from Real Workplace Sessions

The paper introduces EnterpriseClawBench, an enterprise agent benchmark with 852 tasks and an accompanying evaluation protocol; the best configuration reported scores 0.663.

ASPIRE: Agentic /Skills Discovery for Robotics

ASPIRE is a continual learning system that autonomously writes and refines robot control programs in a code-as-policy paradigm, discovering transferable skills and surpassing prior methods on various manipulation tasks.

GaP: A Graph-as-Policy Multi-Agent Self-Learning Harness For Variational Automation Tasks

The paper introduces Graph-as-Policy (GaP), a multi-agent coding harness for Variational Automation tasks that generates directed computation graphs and improves success rates and throughput through internal simulation.

Papers

cs.RO · cs.AI · cs.CL · Empirical · Recent · Jul 6, 2026

GaP: A Graph-as-Policy Multi-Agent Self-Learning Harness For Variational Automation Tasks

Kaiyuan Chen, Shuangyu Xie, Letian Fu, Justin Yu +20 more

cs.RO · cs.AI · cs.MA · Empirical