Qinghua Mao
6 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
CoopGuard is a novel stateful, multi-round defense framework using cooperative agents to significantly reduce the success rate of evolving adversarial attacks against Large Language Models.
The paper introduces EditRisk-Bench, a novel benchmark designed to systematically evaluate the safety risks and downstream reasoning corruption caused by malicious knowledge editing in large language models.
The paper proposes FRA-Attack, a frequency-domain regularization method, to significantly improve the transferability of adversarial attacks against closed-source Multimodal Large Language Models (MLLMs).
This paper introduces GraphSteal, an attack framework demonstrating that Graph RAG systems can leak substantial portions of a hidden knowledge graph by treating them as structural oracles.
The paper introduces AgentDoG 1.5, a lightweight and scalable alignment framework that significantly improves AI agent safety and security for complex, open-world agentic scenarios.
The paper introduces AgentDoG 1.5, a lightweight and scalable alignment framework that significantly improves AI agent safety and security for complex open-world agent deployments.
Papers
AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security
Dongrui Liu, Yu Li, Zhonghao Yang, Peng Wang +46 more
The paper introduces AgentDoG 1.5, a lightweight and scalable alignment framework that significantly improves AI agent safety and security for complex, open-world agentic scenarios.