Shengfang Zhai
2 indexed papers
Research Timeline
The paper introduces BadDLM, a unified framework that demonstrates a new class of backdoor vulnerabilities in Diffusion Language Models (DLMs) by exploiting their forward masking process across diverse targets.
The paper proposes defining 'intent-to-execution integrity' as the necessary end-to-end correctness property for securing LLM agents, arguing that current defenses are insufficient due to untrusted components.
Papers
Securing LLM Agents Need Intent-to-Execution Integrity
Wenjie Qu, Ming Xu, Peiran Wang, Shengfang Zhai +2 more