Yanming Guo

5 indexed papers

Recent (6 mo)

With code

Influential cites

Benchmarked

Publications per year

Top categories

Crypto×5NLP×4AI×1

Frequent co-authors

Yunhao Feng3×

Yifan Ding3×

Yingshui Tan3×

Wenke Huang3×

Xiaohu Du2×

Ming Wen2×

Research Timeline

2026

SkillTrojan: Backdoor Attacks on Skill-Based Agent Systems

SkillTrojan introduces a novel backdoor attack targeting the composition of reusable skills in agent systems, demonstrating high attack success rates with minimal impact on normal system functionality.

EvoDefense: Co-Evolving Black-Box Defense with Large Language Models

EvoDefense introduces an experience-guided, co-evolving black-box defense mechanism that significantly improves LLM robustness against unseen and diverse attacks without requiring model retraining.

EvoDefense: Co-Evolving Black-Box Defense with Large Language Models

EvoDefense introduces an experience-guided, co-evolving black-box defense mechanism that significantly improves the robustness of LLMs against unseen and diverse attacks without requiring model retraining.

BraveGuard: From Open-World Threats to Safer Computer-Use Agents

BraveGuard is a self-evolving defense framework that improves the safety of computer-use agents by training guard models on open-world, multi-step threat trajectories rather than static benchmarks.

BraveGuard: From Open-World Threats to Safer Computer-Use Agents

BraveGuard is a self-evolving defense framework that significantly improves the safety monitoring of computer-use agents by generating guard model supervision from open-world threat discovery and realistic, multi-step execution trajectories.

Highlighted terms show continued research focus across papers

Papers

cs.CRcs.CLRecentMay 31, 2026

BraveGuard: From Open-World Threats to Safer Computer-Use Agents

Yunhao Feng, Yifan Ding, Xiaohu Du, Ming Wen +12 more

BraveGuard is a self-evolving defense framework that improves the safety of computer-use agents by training guard models on open-world, multi-step threat trajectories rather than static benchmarks.

View →

cs.CRcs.CLRecentMay 31, 2026