Yu Yin
3 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper proposes HTell, a fast and lightweight data-free backdoor detector that analyzes the abnormal response concentration of backdoored models on the target class using random latent probes applied directly to the prediction head.
This paper re-evaluates prompt-injection attacks in realistic RAG settings, finding that most prior attack methods fail to reach the generator, and that current attacks are easily detectable.
QUBRIC introduces a co-design framework that simultaneously optimizes queries and rubrics, overcoming the bottleneck of vague rubrics derived from open-ended questions, leading to significant gains in RL performance.
Papers
QUBRIC: Co-Designing Queries and Rubrics for RL Beyond Verifiable Rewards
Rongzhi Zhang, Rui Feng, Zhihan Zhang, Jingfeng Yang +7 more
QUBRIC introduces a co-design framework that simultaneously optimizes queries and rubrics, overcoming the bottleneck of vague rubrics derived from open-ended questions, leading to significant gains in…