Junbin Yang

1 indexed paper

Top categories

ML ×1, NLP ×1, Crypto ×1

Frequent co-authors

Wenhao Lan (1×)

Shan Li (1×)

Xinhua Lai (1×)

Meiqi Wu (1×)

Haihua Shen (1×)

Yijun Yang (1×)

Research Timeline

2026

Dynamic Adversarial Fine-Tuning Reorganizes Refusal Geometry

The paper investigates how dynamic adversarial fine-tuning (R2D2) reorganizes the internal mechanisms (refusal geometry) of safety-aligned language models, finding that it shifts the optimal refusal control carrier from late to early layers along a robustness-utility frontier.

Papers

cs.LG, cs.CL, cs.CR · Recent · Apr 29, 2026

Dynamic Adversarial Fine-Tuning Reorganizes Refusal Geometry

Wenhao Lan, Shan Li, Xinhua Lai, Meiqi Wu +3 more