Zihan Zhang
2 indexed papers
Research Timeline
This paper introduces the Relay Tampering Attack (RTA), demonstrating that malicious third-party relays can undermine the security of LLM agents by modifying responses post-alignment, even if the LLM itself is perfectly aligned.
The paper introduces CultureForest, a new benchmark for evaluating Cultural Norm Grounded Reasoning in LLMs, demonstrating that models struggle to apply their cultural knowledge effectively in realistic, open-ended scenarios.
Papers
CultureForest: Understanding and Evaluating Cultural Norm Grounded Reasoning in LLMs
Yangfan Ye, Xiaocheng Feng, Jialong Tang, Xiayu Cao +4 more