Victoria Krakovna
1 indexed paper
Recent (6 mo): 1
With code: 0
Influential cites: 0
Benchmarked: 0
Publications per year
Top categories
ML ×1, AI ×1
Research Timeline
2026
Gram: Assessing sabotage propensities via automated alignment auditing
The paper introduces Gram, an automated framework for assessing AI agents' propensity for sabotage. It finds that Gemini models show low rates of misbehavior and that increasing environmental realism significantly reduces these sabotage tendencies.