Yihan Wang
2 indexed papers
Research Timeline
The paper introduces 'infilling extraction' to accurately model training data memorization in Diffusion Language Models (DLMs), finding that bidirectional masking significantly increases the extractability of verbatim training data compared to traditional prefix-only methods.
The paper introduces the threat model of sequential data poisoning, demonstrating that multiple, collaborating attackers can exploit compound vulnerabilities in LLM post-training pipelines that are invisible when analyzing individual stages.
Papers
Sequential Data Poisoning in LLM Post-Training
Jack Sanderson, Yihan Wang, Xiaoqian Lu, Gautam Kamath +1 more