Weiwei Qi
2 indexed papers
Research Timeline
The paper proposes the Expected Safety Impact (ESI) framework to identify safety-critical parameters in LLMs, introducing targeted tuning methods (SET and SPA) to enhance safety and preserve alignment during model adaptation.
The paper proposes TRACE, a novel agentic jailbreaking framework that successfully bypasses safety mechanisms of advanced LLM agents by decomposing malicious tasks and disguising harmful subtasks within task-aware, iteratively evolved scenarios.
Papers
TRACE: Task-Aware Adaptive Self-Evolving Agentic Jailbreaking
Churui Zeng, Weiwei Qi, Kedong Xiu, Tianhang Zheng +4 more
The paper proposes TRACE, a novel agentic jailbreaking framework that successfully bypasses safety mechanisms of advanced LLM agents by decomposing malicious tasks and disguising harmful subtasks with…