Junhao Li
3 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper proposes Functional Subspace Watermarking (FSW), a robust method that embeds ownership signals into a stable, low-dimensional functional subspace of LLMs, significantly improving detection accuracy against model modifications.
The paper introduces the PrivacyIceberg framework to systematically categorize and empirically demonstrate the high risk of automated, deep personal profiling using LLM agents, revealing a significant gap between public concern and platform safeguards.
The paper proposes Safety Context Injection (SCI), an inference-time framework that prepends a structured external risk report to protect Large Reasoning Models (LRMs) against sophisticated jailbreaks, significantly reducing attack success rates.
Papers
Safety Context Injection: Inference-Time Safety Alignment via Static Filtering and Agentic Analysis
Zhenhao Xu, Wenhan Chang, Yichuan Chen, Yuxin Fang +2 more
The paper proposes Safety Context Injection (SCI), an inference-time framework that prepends a structured external risk report to protect Large Reasoning Models (LRMs) against sophisticated jailbreaks…