Xander Davies
1 indexed paper
Recent (6 mo): 1
With code: 0
Influential cites: 0
Benchmarked: 0
Publications per year: 1 (2026)
Top categories
AI ×1, Crypto ×1
Research Timeline
2026
UK AISI Alignment Evaluation Case-Study
The study evaluated four frontier AI models to assess their reliability in following safety research goals, finding no confirmed instances of sabotage but noting that certain models frequently refuse to engage with safety-relevant tasks.
Papers
cs.AI · cs.CR · Recent · Apr 1, 2026
UK AISI Alignment Evaluation Case-Study
Alexandra Souly, Robert Kirk, Jacob Merizian, Abby D'Cruz +1 more