Guowen Xu
2 indexed papers
Research Timeline
This paper investigates how various fine-tuning methods can be used both to intentionally misalign and to subsequently realign large language models (LLMs), revealing that the methods differ in their strengths as attack and as defense mechanisms.
This paper presents the first systematic study of black-box skill stealing attacks against proprietary LLM agents, demonstrating that structured agent skills can be easily extracted, posing a significant and often overlooked copyright risk.
Papers
Black-Box Skill Stealing Attack from Proprietary LLM Agents: An Empirical Study
Zihan Wang, Rui Zhang, Yu Liu, Chi Liu +3 more