Ke Tang
2 indexed papers
Research Timeline
The paper introduces EvoJail, an automated multi-objective evolutionary framework that systematically discovers diverse, effective long-tail jailbreak attacks against LLMs by jointly maximizing attack effectiveness and minimizing output perplexity.
The paper proposes PaW, a co-training framework that uses standard RL rollouts to provide auxiliary world model supervision directly during policy training, significantly improving language agent performance.
Papers
Policy and World Modeling Co-Training for Language Agents
Ning Lu, Baijiong Lin, Shengcai Liu, Jiahao Wu +8 more