Zhan Qin

5 indexed papers

Recent (6 mo)

With code

Influential cites

Benchmarked

Publications per year

Top categories

Crypto×5AI×2Databases×1ML×1

Frequent co-authors

Kui Ren4×

Weiwei Qi2×

Tianhang Zheng2×

Churui Zeng1×

Kedong Xiu1×

Chaochao Lu1×

Research Timeline

2026

Towards Identification and Intervention of Safety-Critical Parameters in Large Language Models

The paper proposes the Expected Safety Impact (ESI) framework to identify safety-critical parameters in LLMs, introducing targeted tuning methods (SET and SPA) to enhance safety and preserve alignment during model adaptation.

Channel-Level Semantic Perturbations: Unlearnable Examples for Diverse Training Paradigms

This paper systematically investigates unlearnable examples (UEs) across diverse training paradigms, finding that existing UEs fail under pretraining-finetuning (PF) settings, and proposes Shallow Semantic Camouflage (SSC) to maintain unlearnability.

Defense against Poisoning Attacks under Shuffle-DP

The paper proposes the first general defense framework to make all union-preserving Differential Privacy (DP) protocols, specifically those based on shuffle-DP, resilient against poisoning attacks.

EVA: Editing for Versatile Alignment against Jailbreaks

The paper proposes EVA, a novel framework that uses direct model editing to surgically correct specific neurons responsible for jailbreaking vulnerabilities in LLMs and VLMs, achieving robust safety alignment without performance degradation.

TRACE: Task-Aware Adaptive Self-Evolving Agentic Jailbreaking

The paper proposes TRACE, a novel agentic jailbreaking framework that successfully bypasses safety mechanisms of advanced LLM agents by decomposing malicious tasks and disguising harmful subtasks within task-aware, iteratively evolved scenarios.

Highlighted terms show continued research focus across papers

Papers

cs.CRRecentMay 29, 2026

TRACE: Task-Aware Adaptive Self-Evolving Agentic Jailbreaking

Churui Zeng, Weiwei Qi, Kedong Xiu, Tianhang Zheng +4 more

View →

cs.CRcs.AIRecentMay 14, 2026