ArXivCSExplorer

Hayden Helm

1 indexed paper

Recent (6 mo): 1
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 1

Top categories

Crypto ×1, AI ×1, ML ×1

Frequent co-authors

Xiaodong Liu (1×)
Weiwei Yang (1×)

Research Timeline

2026
Jailbreak susceptibility prediction and mitigation via the behavioral geometry of models

The paper introduces a framework using the 'behavioral geometry' of model populations to efficiently predict jailbreak susceptibility and transfer defenses, achieving high accuracy with significantly fewer evaluations.

Papers

cs.CR, cs.AI, cs.LG | Recent | May 26, 2026

Jailbreak susceptibility prediction and mitigation via the behavioral geometry of models

Hayden Helm, Xiaodong Liu, Weiwei Yang

The paper introduces a framework using the 'behavioral geometry' of model populations to efficiently predict jailbreak susceptibility and transfer defenses, achieving high accuracy with significantly…
