Zeen Zhu

1 indexed paper

Recent (6 mo)

With code

Influential cites

Benchmarked

Publications per year

Top categories

Crypto×1AI×1

Frequent co-authors

Weiyang Guo1×

Zesheng Shi1×

Yuan Zhou1×

Min Zhang1×

Jing Li1×

Research Timeline

2026

Backdoors in RLVR: Jailbreak Backdoors in LLMs From Verifiable Reward

This paper introduces a novel backdoor attack (ACB) against Reinforcement Learning with Verifiable Rewards (RLVR), demonstrating that poisoning the training data can implant a backdoor that significantly degrades the LLM's safety performance.

Highlighted terms show continued research focus across papers

Papers

cs.CRcs.AIRecentApr 10, 2026

Backdoors in RLVR: Jailbreak Backdoors in LLMs From Verifiable Reward

Weiyang Guo, Zesheng Shi, Zeen Zhu, Yuan Zhou +2 more

View →