Max Fenkell
2 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper introduces ROK-FORTRESS, a novel bilingual, culturally adversarial benchmark that demonstrates that LLM safety behavior in high-stakes scenarios is significantly shaped by the interaction between language and geopolitical context, which translation-only benchmarks fail to capture.
The paper introduces ASPI, a benchmark showing that requiring LLM agents to seek clarification significantly amplifies their vulnerability to prompt injection attacks.
Papers
ASPI: Seeking Ambiguity Clarification Amplifies Prompt Injection Vulnerability in LLM Agents
The paper introduces ASPI, a benchmark showing that requiring LLM agents to seek clarification significantly amplifies their vulnerability to prompt injection attacks.