Yue Duan
2 indexed papers
Research Timeline
This paper systematically analyzes how multiple weak jailbreak attacks (mutators) interact when applied sequentially to LLMs. It finds that most combinations fail because of destructive interference, a result that reveals structural properties of model safety alignment.
The paper introduces MoCo-EA, an evolutionary attack method that replaces standard crossover with a continuous Bézier curve interpolation to efficiently exploit the connected manifold structure of adversarial examples.
Papers
MoCo-EA: Exploiting Adversarial Mode Connectivity for Efficient Evolutionary Attacks
Hyo Seo Kim, Gang Luo, Can Chen, Binghui Wang +2 more