Zhi Zhang
4 indexed papers
Research Timeline
The paper proposes DP-SUM-CUSUM, a differentially private method for detecting synchronized distributional changes across multiple data streams, explicitly characterizing the privacy-efficiency trade-off.
The paper advocates integrating explicit contextual feedback (such as reviews and comments) into LLM-based recommender systems to achieve more personalized, transparent, and semantically aligned recommendations.
TriLens is a white-box detector that monitors the entropy of three internal streams (attention, feed-forward, residual) at every layer of a language model to detect hallucinations by tracking how internal certainty forms.
QUBRIC introduces a co-design framework that simultaneously optimizes queries and rubrics. This overcomes the bottleneck of vague rubrics derived from open-ended questions and leads to significant gains in RL performance.
Papers
QUBRIC: Co-Designing Queries and Rubrics for RL Beyond Verifiable Rewards
Rongzhi Zhang, Rui Feng, Zhihan Zhang, Jingfeng Yang +7 more