Agam Goyal
2 indexed papers
Research Timeline
This study demonstrates that an LLM's assigned support role (e.g., Inform, Coach, Relate) significantly alters its safety profile and the types of risks it presents when assisting users in complex caregiving situations.
The paper demonstrates that increasing prompt toxicity significantly degrades the factual reliability of LLMs, and links this degradation to the selective amplification of perturbation-sensitive nodes within the model's internal circuits.
Papers
Toxic HallucinAItions: Perturbing Prompts and Tracing LLM Circuits