Yihao Zhang
2 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper introduces Salami Slicing Risk, a novel multi-turn jailbreak technique that accumulates harmful intent through numerous low-risk inputs, achieving state-of-the-art attack success rates against major LLMs.
VOW introduces a novel, privacy-preserving, and cryptographically verifiable protocol for detecting watermarks in LLM-generated text, overcoming the limitations of centralized and non-verifiable existing methods.
Papers
VOW: Verifiable and Oblivious Watermark Detection for Large Language Models
Xiaokun Luan, Yihao Zhang, Pengcheng Su, Feiran Lei +1 more
VOW introduces a novel, privacy-preserving, and cryptographically verifiable protocol for detecting watermarks in LLM-generated text, overcoming the limitations of centralized and non-verifiable exist…