Xiang Fang
2 indexed papers
Research Timeline
The paper proposes the Adversarial Prompt Disentanglement (APD) framework, a novel defense mechanism that proactively identifies and neutralizes malicious components in LLM prompts, reducing harmful outputs by more than 85%.
Papers
Disentangling Adversarial Prompts: A Semantic-Graph Defense for Robust LLM Security
The paper proposes the Adversarial Prompt Disentanglement (APD) framework, a novel defense that proactively identifies and neutralizes malicious components in LLM prompts, achieving over 85% reduction…