Pengyuan Li
4 indexed papers
Research Timeline
The paper introduces Dr. DocBench, a comprehensive, difficulty-aware benchmark that tests vision-language models (VLMs) on challenging, expert-level document parsing, and shows that current state-of-the-art models fail on complex, domain-specific structures.
THRD is a training-free framework that models temporal risk accumulation to defend against multi-turn jailbreak attacks on LLMs, significantly reducing attack success rates while maintaining model utility.
The paper demonstrates that explicit gender cues systematically affect LLM value trade-offs, causing decision flips that are often masked or misattributed by the models themselves.
NeuroArmor is a white-box runtime defense that uses prompt-specific safe variants to selectively detect and mitigate jailbreak attacks, significantly reducing attack success rates while maintaining a low false positive rate.
Papers
NeuroArmor: Safe-Variant-Guided Representation Consistency for Selective Re-Anchoring in Jailbreak Defense