ArXivCSExplorer

Gjergji Kasneci

2 indexed papers

Recent (6 mo): 2
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 2

Top categories

NLP ×2, ML ×1, Crypto ×1

Frequent co-authors

Zheyu Zhang (1×)
Shuo Yang (1×)
Yuxiao Li (1×)
Alina Fastowski (1×)
Efstratios Zaradoukas (1×)
Bardh Prenkaj (1×)

Research Timeline

2026
Analysing the Safety Pitfalls of Steering Vectors

This paper systematically audits the safety implications of activation steering vectors, finding that these vectors significantly influence the success rate of jailbreak attacks by overlapping with latent refusal directions.

Consolidating Rewarded Perturbations for LLM Post-Training

The paper introduces CoRP, a gradient-free operator that consolidates the benefits of ensemble-based post-training methods into a single, deployable model update, significantly improving performance with minimal computational overhead.

Papers

cs.CL, cs.LG · Recent · May 29, 2026

Consolidating Rewarded Perturbations for LLM Post-Training

Zheyu Zhang, Shuo Yang, Gjergji Kasneci

cs.CR, cs.CL · Recent · Mar 25, 2026

Analysing the Safety Pitfalls of Steering Vectors

Yuxiao Li, Alina Fastowski, Efstratios Zaradoukas, Bardh Prenkaj +1 more
