Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:
ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Home/Authors/Weiming Zhang

Weiming Zhang

5 indexed papers

Recent (6 mo)
5
With code
0
Influential cites
0
Benchmarked
0

Publications per year

5
26

Top categories

Crypto×5AI×2

Frequent co-authors

Kejiang Chen3×
Xi Yang2×
Chang Liu2×
Haoran Li2×
Yangqiu Song2×
Benlong Wu2×

Research Timeline

2026
Into the Gray Zone: Domain Contexts Can Blur LLM Safety Boundaries

The paper introduces Jargon, a novel adversarial framework that exploits the vulnerability of LLMs to context-specific safety boundary blurring, achieving high attack success rates across multiple frontier models.

VertMark: A Unified Training-Free Robust Watermarking Framework for Vertical Domain Pre-trained Language Models

VertMark introduces a novel, unified, and training-free framework to embed robust watermarks into vertical domain pre-trained language models (VPLMs) for copyright protection across multiple specialized domains.

Provably Secure Agent Guardrail

The paper introduces a formal, logically constrained framework, ePCA, to secure advanced AI agents by forcing them to translate natural language intentions into first-order logical constraints before execution, achieving provably secure performance.

Provably Secure Agent Guardrail

The paper introduces an executable Proof-Constrained Action (ePCA) framework that secures AI agents by forcing them to formalize their intentions into first-order logical constraints, achieving provably secure operation.

Steering LLM Viewpoints through Fabricated Evidence Injection

This paper introduces Ghostwriter, an attack framework demonstrating that LLMs are highly vulnerable to adopting misleading viewpoints when provided with fabricated, yet credible-looking, evidence.

Highlighted terms show continued research focus across papers

Papers

cs.CRRecentJun 4, 2026

Steering LLM Viewpoints through Fabricated Evidence Injection

Xi Yang, Chang Liu, Zhenglin Huang, Haoran Li +3 more

This paper introduces Ghostwriter, an attack framework demonstrating that LLMs are highly vulnerable to adopting misleading viewpoints when provided with fabricated, yet credible-looking, evidence.

View →
cs.AIcs.CRRecentMay 28, 2026

Provably Secure Agent Guardrail

Benlong Wu, Weiming Zhang, Kejiang Chen, Han Fang +1 more

The paper introduces a formal, logically constrained framework, ePCA, to secure advanced AI agents by forcing them to translate natural language intentions into first-order logical constraints before…

View →
cs.AIcs.CRRecentMay 28, 2026

Provably Secure Agent Guardrail

Benlong Wu, Weiming Zhang, Kejiang Chen, Han Fang +1 more

The paper introduces an executable Proof-Constrained Action (ePCA) framework that secures AI agents by forcing them to formalize their intentions into first-order logical constraints, achieving provab…

View →
cs.CRRecentMay 4, 2026

VertMark: A Unified Training-Free Robust Watermarking Framework for Vertical Domain Pre-trained Language Models

Cong Kong, Xin Cheng, Zhaoxia Yin, Shuai Li +2 more

VertMark introduces a novel, unified, and training-free framework to embed robust watermarks into vertical domain pre-trained language models (VPLMs) for copyright protection across multiple specializ…

View →
cs.CRRecentApr 17, 2026

Into the Gray Zone: Domain Contexts Can Blur LLM Safety Boundaries

Ki Sen Hung, Xi Yang, Chang Liu, Haoran Li +6 more

The paper introduces Jargon, a novel adversarial framework that exploits the vulnerability of LLMs to context-specific safety boundary blurring, achieving high attack success rates across multiple fro…

View →