Zikai Zhang
2 indexed papers
Research Timeline
SelfGrader proposes a lightweight, robust guardrail for detecting LLM jailbreaks. It formulates detection as a numerical grading task over anchored token-level logits and reports strong performance across multiple benchmarks.
XMark introduces a novel multi-bit watermarking technique that reliably embeds binary messages into LLM-generated text while maintaining high text quality and robust performance even with limited token context.
Papers
XMark: Reliable Multi-Bit Watermarking for LLM-Generated Texts