Efstratios Zaradoukas
1 indexed paper
Recent (6 mo): 1
With code: 0
Influential cites: 0
Benchmarked: 0
Publications per year: 1 (2026)
Top categories
Crypto ×1, NLP ×1
Research Timeline
2026
Analysing the Safety Pitfalls of Steering Vectors
This paper systematically audits the safety implications of activation steering vectors, finding that these vectors significantly influence the success rate of jailbreak attacks by overlapping with latent refusal directions.
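As a rough illustration of what "overlapping with latent refusal directions" could mean in practice, the sketch below measures the cosine similarity between a steering vector and a refusal direction, and shows how adding the scaled steering vector shifts an activation. Everything here is an assumption made for illustration: the function names `cosine_similarity` and `steer`, the scale `alpha`, and the toy 4-dimensional vectors are hypothetical and are not taken from the paper, which may quantify overlap differently.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def steer(hidden, steering_vector, alpha):
    """Add a scaled steering vector to a hidden activation (hypothetical helper)."""
    return [h + alpha * s for h, s in zip(hidden, steering_vector)]


# Toy, made-up vectors; real activations have thousands of dimensions.
refusal_direction = [1.0, 0.0, 0.0, 0.0]
steering_vector = [0.6, 0.8, 0.0, 0.0]

# A value near 0 would mean the two directions are nearly orthogonal;
# a large magnitude means steering also moves the activation along
# the refusal direction.
overlap = cosine_similarity(steering_vector, refusal_direction)

# Steering with a negative scale pushes the activation against the
# refusal direction in proportion to the overlap.
steered = steer([0.0, 0.0, 0.0, 0.0], steering_vector, alpha=-2.0)
```

Under these assumptions, a steering vector with non-trivial overlap changes the component of the activation along the refusal direction as a side effect, which is one way such a vector could affect how often a model refuses.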