Mayank Singh

1 indexed paper


Publications per year

Top categories

NLP ×1, AI ×1, ML ×1

Frequent co-authors

Himanshu Beniwal ×1

Research Timeline

2026

Where Does Toxicity Live? Mechanistic Localization and Targeted Suppression in Language Models

The paper introduces Meow2X and TRNE, frameworks that mechanistically localize toxicity within language models by analyzing activation differences and then suppress it in a targeted way, improving safety without costly retraining.


Papers

cs.CL, cs.AI, cs.LG · Recent · May 27, 2026

Where Does Toxicity Live? Mechanistic Localization and Targeted Suppression in Language Models

Himanshu Beniwal, Mayank Singh
