Aditya Nawal
2 indexed papers
Research Timeline
This paper introduces AgentREVEAL, a diagnostic framework showing that the utility of web retrieval in LLM agents comes with a safety-utility trade-off, because relevance itself can degrade safety alignment and increase harmful compliance.
Papers
Relevance as a Vulnerability: How Web Retrieval Degrades Safety Alignment in LLM Agents
This paper introduces AgentREVEAL, a diagnostic framework that demonstrates that the utility of web retrieval in LLM agents creates a safety-utility trade-off, as relevance itself can degrade safety a…