Kailong Wang
4 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper introduces TrojanMerge, a framework demonstrating that model merging can be exploited to systematically compromise the safety alignment of multiple individually safe LLMs.
RefineRAG introduces a novel word-level poisoning framework that significantly enhances knowledge poisoning attacks against RAG systems, achieving state-of-the-art effectiveness and transferability to black-box environments.
MATRIX is a novel, robust code watermarking framework that encodes watermarks using constrained parity-check matrix equations, achieving high detection accuracy and improved robustness for code provenance tracking.
The paper proposes UNSEEN, a cross-stack defense system combining AR access control, LLM unlearning, and agent guardrails to mitigate sophisticated AR-LLM social engineering attacks.
Papers
UNSEEN: A Cross-Stack LLM Unlearning Defense against AR-LLM Social Engineering Attacks
Tianlong Yu, Yang Yang, Xiao Luo, Lihong Liu +5 more
The paper proposes UNSEEN, a cross-stack defense system combining AR access control, LLM unlearning, and agent guardrails to mitigate sophisticated AR-LLM social engineering attacks.