Zihao Wang

5 indexed papers

Recent (6 mo): 5
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 5

Top categories

Crypto ×4, NLP ×1, AI ×1, ML ×1

Frequent co-authors

XiaoFeng Wang (3×)
Jiaxin Bai (1×)
Yue Guo (1×)
Yifei Dong (1×)
Jiaxuan Xiong (1×)
Tianshi Zheng (1×)

Research Timeline

2026
LocalAlign: Enabling Generalizable Prompt Injection Defense via Generation of Near-Target Adversarial Examples for Alignment Training

LocalAlign proposes a generalizable prompt injection defense by generating near-target adversarial examples, which enforces a tighter robustness boundary around the correct model response.

Misrouter: Exploiting Routing Mechanisms for Input-Only Attacks on Mixture-of-Experts LLMs

Misrouter introduces an input-only adversarial framework to exploit the routing mechanisms of Mixture-of-Experts (MoE) LLMs, enabling unsafe behavior induction against remotely hosted, black-box services.

DP-SelFT: Differentially Private Selective Fine-Tuning for Large Language Models

The paper proposes DP-SelFT, a novel framework for differentially private selective fine-tuning that significantly improves the privacy-utility trade-off for LLMs by intelligently selecting robust parameter subsets.

Rethinking Side-Channel Analysis: Automated Discovery and Analysis of Side-Channel Leakage with LLM-Assisted Agents

The paper introduces SCAgent, an automated framework that uses LLM-assisted agents to systematically discover, analyze, and assess side-channel leakage risks in complex systems like iOS, moving beyond manual and predefined analysis.

PatchWorld: Gradient-Free Optimization of Executable World Models

PatchWorld introduces a gradient-free framework to create executable Python world models from offline trajectories, achieving high planning scores by inducing symbolic belief-state programs.


Papers

cs.CL, cs.AI · Recent · May 29, 2026

PatchWorld: Gradient-Free Optimization of Executable World Models

Jiaxin Bai, Yue Guo, Yifei Dong, Jiaxuan Xiong +12 more

PatchWorld introduces a gradient-free framework to create executable Python world models from offline trajectories, achieving high planning scores by inducing symbolic belief-state programs.

cs.LG, cs.CR · Recent · May 17, 2026

DP-SelFT: Differentially Private Selective Fine-Tuning for Large Language Models

Haichao Sha, Zihao Wang, Yuncheng Wu, Hong Chen +1 more

The paper proposes DP-SelFT, a novel framework for differentially private selective fine-tuning that significantly improves the privacy-utility trade-off for LLMs by intelligently selecting robust parameter subsets.

cs.CR · Recent · May 17, 2026

Rethinking Side-Channel Analysis: Automated Discovery and Analysis of Side-Channel Leakage with LLM-Assisted Agents

Zhen Xu, Zihao Wang, Yuhua Sun, XiaoFeng Wang

The paper introduces SCAgent, an automated framework that uses LLM-assisted agents to systematically discover, analyze, and assess side-channel leakage risks in complex systems like iOS, moving beyond manual and predefined analysis.

cs.CR · Recent · May 6, 2026

Misrouter: Exploiting Routing Mechanisms for Input-Only Attacks on Mixture-of-Experts LLMs

Zekun Fei, Zihao Wang, Weijie Liu, Ruiqi He +3 more

Misrouter introduces an input-only adversarial framework to exploit the routing mechanisms of Mixture-of-Experts (MoE) LLMs, enabling unsafe behavior induction against remotely hosted, black-box services.

cs.CR · Recent · May 2, 2026

LocalAlign: Enabling Generalizable Prompt Injection Defense via Generation of Near-Target Adversarial Examples for Alignment Training

Yuyang Gong, Zihao Wang, Jiawei Liu, XiaoFeng Wang

LocalAlign proposes a generalizable prompt injection defense by generating near-target adversarial examples, which enforces a tighter robustness boundary around the correct model response.
