Built by Teycir Ben Soltane
ArXivCSExplorer

Mayank Singh

1 indexed paper

Recent (6 mo): 1
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 1

Top categories

NLP ×1, AI ×1, ML ×1

Frequent co-authors

Himanshu Beniwal ×1

Research Timeline

2026
Where Does Toxicity Live? Mechanistic Localization and Targeted Suppression in Language Models

The paper introduces Meow2X and TRNE, retraining-free frameworks that mechanistically localize toxicity within language models by analyzing activation differences and then suppress it, improving safety without the cost of retraining.


Papers

cs.CL · cs.AI · cs.LG · Recent · May 27, 2026

Where Does Toxicity Live? Mechanistic Localization and Targeted Suppression in Language Models

Himanshu Beniwal, Mayank Singh

