Yangqiu Song

7 indexed papers

Recent (6 mo)

With code

Influential cites

Benchmarked

Publications per year

Top categories

Crypto×4AI×3NLP×1

Frequent co-authors

Haoran Li4×

Xi Yang3×

Chang Liu3×

Weiming Zhang2×

Tsun On Kwok2×

Ki Sen Hung2×

Research Timeline

2026

WebAgentGuard: A Reasoning-Driven Guard Model for Detecting Prompt Injection Attacks in Web Agents

The paper introduces WebAgentGuard, a novel reasoning-driven, multimodal guard model that effectively detects prompt injection attacks in vulnerable web agents without compromising their functionality.

Into the Gray Zone: Domain Contexts Can Blur LLM Safety Boundaries

The paper introduces Jargon, a novel adversarial framework that exploits the vulnerability of LLMs to context-specific safety boundary blurring, achieving high attack success rates across multiple frontier models.

HypoAgent: An Agentic Framework for Interactive Abductive Hypothesis Generation over Knowledge Graphs

HypoAgent is an agentic framework that enables interactive, multi-turn abductive hypothesis generation over knowledge graphs, achieving state-of-the-art performance by integrating specialized agents for intent grounding, hypothesis generation, and root cause analysis.

PatchWorld: Gradient-Free Optimization of Executable World Models

PatchWorld introduces a gradient-free framework to create executable Python world models from offline trajectories, achieving high planning scores by inducing symbolic belief-state programs.

SkillRevise: Improving LLM-Authored Agent Skills via Trace-Conditioned Skill Revision

SkillRevise is an execution-grounded framework that iteratively refines initial, imperfect LLM agent skills by diagnosing defects from execution evidence and applying empirically validated edits, significantly boosting agent performance.

Steering LLM Viewpoints through Fabricated Evidence Injection

This paper introduces Ghostwriter, an attack framework demonstrating that LLMs are highly vulnerable to adopting misleading viewpoints when provided with fabricated, yet credible-looking, evidence.

SentinelRAG: Synthetic Sentinel Knowledge for RAG Database Copyright Protection

SentinelRAG introduces a novel watermarking framework that embeds style-consistent, fictitious knowledge entries into RAG databases, allowing for reliable detection of unauthorized redistribution while minimizing impact on legitimate queries.

Highlighted terms show continued research focus across papers

Papers

cs.CRRecentJun 4, 2026

Steering LLM Viewpoints through Fabricated Evidence Injection

Xi Yang, Chang Liu, Zhenglin Huang, Haoran Li +3 more

This paper introduces Ghostwriter, an attack framework demonstrating that LLMs are highly vulnerable to adopting misleading viewpoints when provided with fabricated, yet credible-looking, evidence.

View →

cs.CRRecentJun 4, 2026