Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:
ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Home/Authors/Zhi Wang

Zhi Wang

5 indexed papers

Recent (6 mo)
5
With code
0
Influential cites
0
Benchmarked
0

Publications per year

5
26

Top categories

AI×4Crypto×3NLP×2ML×1Networking×1

Frequent co-authors

Xuekang Wang1×
Zhuoyuan Hao1×
Shuo Hou1×
Hao Peng1×
Juanzi Li1×
Xiaozhi Wang1×

Research Timeline

2026
Large Language Models for Agentic NetOps and AIOps: Architectures, Evaluation, and Safety

The paper surveys the use of LLMs for agentic NetOps and AIOps, arguing that operational reliability depends not on the model itself, but on robust surrounding machinery and workflow-centered evaluation.

Babel: Jailbreaking Safety Attention via Obfuscation Distribution Optimized Sampling

The paper introduces Babel, an efficient black-box attack framework that systematically exploits intrinsic safety gaps in LLMs by optimizing text obfuscation sampling, achieving state-of-the-art jailbreak success rates on commercial models.

Cordon-MAS: Defending RAG against Knowledge Poisoning via Information-Flow Control

The paper introduces CORDON-MAS, a compartmentalized framework that defends Retrieval-Augmented Generation (RAG) against knowledge poisoning by enforcing strict information-flow control, significantly reducing attack success rates.

Which Institutional Frameworks Do Chatbots Assume? Auditing Jurisdictional Defaults in Multilingual LLMs

This study finds that when users do not specify a jurisdiction, the language used in the prompt strongly biases the LLM's response toward a specific national legal framework (U.S. for English, China for Mandarin Chinese), creating a risk of institutional misselection.

Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning

This paper introduces CHERRL, a controllable hacking environment for rubric-based reinforcement learning to study and mitigate reward hacking.

Highlighted terms show continued research focus across papers

Papers

cs.LGcs.AIcs.CLRecentJun 3, 2026

Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning

Xuekang Wang, Zhuoyuan Hao, Shuo Hou, Hao Peng +2 more

This paper introduces CHERRL, a controllable hacking environment for rubric-based reinforcement learning to study and mitigate reward hacking.

View →
cs.CLRecentMay 29, 2026

Which Institutional Frameworks Do Chatbots Assume? Auditing Jurisdictional Defaults in Multilingual LLMs

Zhizhi Wang, Harini Suresh

This study finds that when users do not specify a jurisdiction, the language used in the prompt strongly biases the LLM's response toward a specific national legal framework (U.S. for English, China f…

View →
cs.CRcs.AIRecentMay 26, 2026

Cordon-MAS: Defending RAG against Knowledge Poisoning via Information-Flow Control

Zhe Yu, Wenpeng Xing, Gaolei Li, Shuguang Xiong +3 more

The paper introduces CORDON-MAS, a compartmentalized framework that defends Retrieval-Augmented Generation (RAG) against knowledge poisoning by enforcing strict information-flow control, significantly…

View →
cs.CRcs.AIRecentMay 18, 2026

Babel: Jailbreaking Safety Attention via Obfuscation Distribution Optimized Sampling

Ziwei Wang, Jing Chen, Ruichao Liang, Zhi Wang +5 more

The paper introduces Babel, an efficient black-box attack framework that systematically exploits intrinsic safety gaps in LLMs by optimizing text obfuscation sampling, achieving state-of-the-art jailb…

View →
cs.NIcs.AIcs.CRRecentMay 12, 2026

Large Language Models for Agentic NetOps and AIOps: Architectures, Evaluation, and Safety

Muhammad Bilal, Jon Crowcroft, Ruizhi Wang, Xiaolong Xu +1 more

The paper surveys the use of LLMs for agentic NetOps and AIOps, arguing that operational reliability depends not on the model itself, but on robust surrounding machinery and workflow-centered evaluati…

View →