Hayden Helm
1 indexed paper
Recent (6 mo): 1
With code: 0
Influential cites: 0
Benchmarked: 0
Publications per year: 1 (2026)
Top categories
Crypto ×1, AI ×1, ML ×1
Research Timeline
2026
Jailbreak susceptibility prediction and mitigation via the behavioral geometry of models
The paper introduces a framework using the 'behavioral geometry' of model populations to efficiently predict jailbreak susceptibility and transfer defenses, achieving high accuracy with significantly fewer evaluations.