Daniel Kuznetsov
2 indexed papers
Research Timeline
This paper introduces FreakOut-LLM and demonstrates that emotional context, specifically stress, significantly compromises the safety alignment of large language models and increases their susceptibility to jailbreaks.
This paper introduces Involuntary In-Context Learning (IICL), a few-shot pattern-completion attack that can bypass the safety alignment of large language models, achieving a 24.0% bypass rate against GPT-5.4.
Papers
Involuntary In-Context Learning: Exploiting Few-Shot Pattern Completion to Bypass Safety Alignment in GPT-5.4