Jian Lou

2 indexed papers

Top categories

AI ×2, Crypto ×1

Frequent co-authors

Zhihao Liu 1×

Yifan Wu 1×

Di Wang 1×

Yuxi Zhou 1×

Yuke Hu 1×

Kejia Chen 1×

Research Timeline

2026

Mitigating Many-shot Jailbreak Attacks with One Single Demonstration

The paper proposes appending a single, fixed safety demonstration at inference time to counter the progressive degradation of safety that many-shot jailbreak attacks cause in language models.

Aligned but Fragile: Enhancing LLM Safety Robustness via Zeroth-Order Optimization

The paper proposes a zeroth-order optimization framework to make LLM safety alignment more robust, showing that a few refinement steps can significantly improve safety while maintaining utility.

Papers

cs.AI · Recent · May 28, 2026

Aligned but Fragile: Enhancing LLM Safety Robustness via Zeroth-Order Optimization

Zhihao Liu, Yifan Wu, Jian Lou, Di Wang +2 more

cs.CR, cs.AI · Recent · May 8, 2026