Xander Davies

1 indexed paper

Top categories

AI (×1), Crypto (×1)

Frequent co-authors

Alexandra Souly (1×)

Robert Kirk (1×)

Jacob Merizian (1×)

Abby D'Cruz (1×)

Research Timeline

2026

UK AISI Alignment Evaluation Case-Study

The study evaluated four frontier AI models to assess their reliability in following safety research goals, finding no confirmed instances of sabotage but noting that certain models frequently refuse to engage with safety-relevant tasks.

Papers

cs.AI, cs.CR · Recent · Apr 1, 2026

UK AISI Alignment Evaluation Case-Study

Alexandra Souly, Robert Kirk, Jacob Merizian, Abby D'Cruz +1 more
