Haoran Li

5 indexed papers

Recent (6 mo): 5
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 5

Top categories

Crypto ×4, AI ×2, Vision ×1, ML ×1

Frequent co-authors

Yangqiu Song (4×)
Xi Yang (2×)
Chang Liu (2×)
Weiming Zhang (2×)
Zhenglin Huang (1×)
Jian Weng (1×)

Research Timeline

2026
REFORGE: Multi-modal Attacks Reveal Vulnerable Concept Unlearning in Image Generation Models

The paper introduces REFORGE, a black-box red-teaming framework that uses adversarial image prompts to reveal persistent vulnerabilities in current Image Generation Model Unlearning (IGMU) methods.

WebAgentGuard: A Reasoning-Driven Guard Model for Detecting Prompt Injection Attacks in Web Agents

The paper introduces WebAgentGuard, a novel reasoning-driven, multimodal guard model that effectively detects prompt injection attacks in vulnerable web agents without compromising their functionality.

Into the Gray Zone: Domain Contexts Can Blur LLM Safety Boundaries

The paper introduces Jargon, a novel adversarial framework that exploits the vulnerability of LLMs to context-specific safety boundary blurring, achieving high attack success rates across multiple frontier models.

SkillRevise: Improving LLM-Authored Agent Skills via Trace-Conditioned Skill Revision

SkillRevise is an execution-grounded framework that iteratively refines initial, imperfect LLM agent skills by diagnosing defects from execution evidence and applying empirically validated edits, significantly boosting agent performance.

Steering LLM Viewpoints through Fabricated Evidence Injection

This paper introduces Ghostwriter, an attack framework demonstrating that LLMs are highly vulnerable to adopting misleading viewpoints when provided with fabricated, yet credible-looking, evidence.

Papers

cs.CR · Recent · Jun 4, 2026

Steering LLM Viewpoints through Fabricated Evidence Injection

Xi Yang, Chang Liu, Zhenglin Huang, Haoran Li +3 more

This paper introduces Ghostwriter, an attack framework demonstrating that LLMs are highly vulnerable to adopting misleading viewpoints when provided with fabricated, yet credible-looking, evidence.

cs.AI · Recent · May 31, 2026

SkillRevise: Improving LLM-Authored Agent Skills via Trace-Conditioned Skill Revision

Yuxuan Liu, Zhaochen Su, Lingyun Xie, Yuhao Zhang +10 more

SkillRevise is an execution-grounded framework that iteratively refines initial, imperfect LLM agent skills by diagnosing defects from execution evidence and applying empirically validated edits, sign…

cs.CR · Recent · Apr 17, 2026

Into the Gray Zone: Domain Contexts Can Blur LLM Safety Boundaries

Ki Sen Hung, Xi Yang, Chang Liu, Haoran Li +6 more

The paper introduces Jargon, a novel adversarial framework that exploits the vulnerability of LLMs to context-specific safety boundary blurring, achieving high attack success rates across multiple fro…

cs.CR · Recent · Apr 14, 2026

WebAgentGuard: A Reasoning-Driven Guard Model for Detecting Prompt Injection Attacks in Web Agents

Yulin Chen, Tri Cao, Haoran Li, Yue Liu +6 more

The paper introduces WebAgentGuard, a novel reasoning-driven, multimodal guard model that effectively detects prompt injection attacks in vulnerable web agents without compromising their functionality…

cs.CV, cs.AI, cs.CR · Recent · Mar 17, 2026

REFORGE: Multi-modal Attacks Reveal Vulnerable Concept Unlearning in Image Generation Models

Yong Zou, Haoran Li, Fanxiao Li, Shenyang Wei +4 more

The paper introduces REFORGE, a black-box red-teaming framework that uses adversarial image prompts to reveal persistent vulnerabilities in current Image Generation Model Unlearning (IGMU) methods.