Aman Chadha
2 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper introduces 5WBENCH, a new benchmark for causal unlearning, and proposes MAAT, a novel three-phase framework that achieves high forgetting and high retention specifically on complex 'Why'-type causal knowledge.
The paper introduces MENTIS, a geometry-first framework that measures how preference alignment structurally changes the internal computations of language models, finding that these changes are selective, depth-localized, and concept-dependent.
Papers
MENTIS: What Belief Changes Under Alignment? Measuring Multi-Scale Latent Torsion in Language Models
The paper introduces MENTIS, a geometry-first framework that measures how preference alignment structurally changes the internal computations of language models, finding that these changes are selecti…