Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:
ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Home/Authors/Subhadip Mitra

Subhadip Mitra

4 indexed papers

Recent (6 mo)
4
With code
0
Influential cites
0
Benchmarked
0

Publications per year

4
26

Top categories

Crypto×4NLP×4Emerging Tech×4ML×4Neural Computing×4

Research Timeline

2026
Cross-Generational Transfer of Adversarial Attacks Reveals Non-Monotonic Safety Alignment in LLMs

The study demonstrates that safety alignment in LLMs is non-monotonic across model generations, showing that Gemma 3 exhibits a significantly higher attack success rate than both its predecessor and successor.

Quality-Diversity Evolution for Discovering Diverse Vulnerabilities in LLM Safety

The paper introduces a quality-diversity evolutionary framework that discovers diverse, interpretable vulnerabilities in large language models by evolving attack strategies at the semantic level, revealing systematic, model-specific weaknesses.

Cross-Generational Transfer of Adversarial Attacks Reveals Non-Monotonic Safety Alignment in LLMs

The study demonstrates that LLM safety alignment is non-monotonic across model generations, showing that Gemma 3 exhibits unexpectedly high vulnerability to adversarial attacks compared to both its predecessors and successors.

Quality-Diversity Evolution for Discovering Diverse Vulnerabilities in LLM Safety

The paper introduces a quality-diversity evolutionary framework that evolves interpretable attack strategies, successfully discovering distinct and systematic vulnerabilities in major LLMs like GPT-4o-mini and Gemini.

Highlighted terms show continued research focus across papers

Papers

cs.CRcs.CLcs.ETRecentMay 30, 2026

Cross-Generational Transfer of Adversarial Attacks Reveals Non-Monotonic Safety Alignment in LLMs

Subhadip Mitra

The study demonstrates that safety alignment in LLMs is non-monotonic across model generations, showing that Gemma 3 exhibits a significantly higher attack success rate than both its predecessor and s…

View →
cs.CRcs.CLcs.ETRecentMay 30, 2026

Quality-Diversity Evolution for Discovering Diverse Vulnerabilities in LLM Safety

Subhadip Mitra

The paper introduces a quality-diversity evolutionary framework that discovers diverse, interpretable vulnerabilities in large language models by evolving attack strategies at the semantic level, reve…

View →
cs.CRcs.CLcs.ETRecentMay 30, 2026

Cross-Generational Transfer of Adversarial Attacks Reveals Non-Monotonic Safety Alignment in LLMs

Subhadip Mitra

The study demonstrates that LLM safety alignment is non-monotonic across model generations, showing that Gemma 3 exhibits unexpectedly high vulnerability to adversarial attacks compared to both its pr…

View →
cs.CRcs.CLcs.ETRecentMay 30, 2026

Quality-Diversity Evolution for Discovering Diverse Vulnerabilities in LLM Safety

Subhadip Mitra

The paper introduces a quality-diversity evolutionary framework that evolves interpretable attack strategies, successfully discovering distinct and systematic vulnerabilities in major LLMs like GPT-4o…

View →