Rui Hu
4 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
SelfGrader proposes a lightweight, robust guardrail for detecting LLM jailbreaks by formulating the detection problem as a numerical grading task using anchored token-level logits, achieving strong performance across various benchmarks.
XMark introduces a novel multi-bit watermarking technique that reliably embeds binary messages into LLM-generated text while maintaining high text quality and robust performance even with limited token context.
The paper introduces MemTrace, a framework that treats LLM memory pipelines as traceable graphs to systematically diagnose and automatically correct memory-related errors, boosting performance by up to 7.62%.
The paper introduces EUDAIMONIA, a new framework and benchmark for evaluating how well LLMs align with user welfare in social interactions, finding that even state-of-the-art models frequently violate social-alignment requirements.
Papers
EUDAIMONIA: Evaluating Undesirable Dynamics in AI
Jun Rui Huang, Wang Bill Zhu, Ziyi Liu, Nathanael Fast +2 more
The paper introduces EUDAIMONIA, a new framework and benchmark for evaluating how well LLMs align with user welfare in social interactions, finding that even state-of-the-art models frequently violate…