Anshuman Suri
2 indexed papers
Research Timeline
The paper introduces BOA, a novel framework that measures agent safety by exhaustively searching the entire in-budget trajectory space, thereby identifying unsafe behaviors missed by traditional sampling methods.
The paper introduces PoisonForge, a comprehensive benchmark demonstrating that even a small number of targeted poisoned examples can significantly compromise the safety and reliability of instruction-tuned LLMs across various model sizes.
Papers
PoisonForge: Task-Level Targeted Poisoning Benchmark for Instruction-Tuned LLMs
Luze Sun, Anshuman Suri, Harsh Chaudhari, Cristina Nita-Rotaru +1 more