Virginia Smith

1 indexed paper

Top categories

ML ×1, Crypto ×1

Frequent co-authors

Kevin Kuo (1×)

Chhavi Yadav (1×)

Research Timeline

2026

Open-Weight LLM Fine-Tuning Defenses are Susceptible to Simple Attacks

This paper demonstrates that existing safeguards for open-weight LLMs are vulnerable to simple, non-gradient-based attacks such as abliteration and prefilling, which significantly increase the attack success rate.

Papers

cs.LG, cs.CR · Recent · May 26, 2026

Open-Weight LLM Fine-Tuning Defenses are Susceptible to Simple Attacks

Kevin Kuo, Chhavi Yadav, Virginia Smith
