Built by Teycir Ben Soltane
ArXivCSExplorer

Jiaqing Li

3 indexed papers

Recent (6 mo): 3
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 3

Top categories

NLP ×1, ML ×1, AI ×1, Crypto ×1

Frequent co-authors

Jiaqing Liang (2×)
Deqing Yang (2×)
Wangyi Mei (1×)
Zhouhong Gu (1×)
Zhenhan Bai (1×)
Yin Cai (1×)

Research Timeline

2026
When Safe Models Merge into Danger: Exploiting Latent Vulnerabilities in LLM Fusion

The paper introduces TrojanMerge, a framework demonstrating that model merging can be exploited to systematically compromise the safety alignment of multiple individually safe LLMs.

ProRL: Effective Reinforcement Learning for Proactive Recommendation via Rectified Policy Gradient Estimation

The paper proposes ProRL, an effective Reinforcement Learning framework that rectifies gradient estimation deficiencies to optimize proactive recommendation paths, significantly outperforming existing state-of-the-art methods.

Deep Research as Rubric for Reinforcement Learning

The paper proposes Deep Research as Rubric (DR-rubric), a novel evidence-driven framework that treats rubric construction itself as a research problem to generate fine-grained, scalable reward signals for open-ended reasoning tasks.

Papers

cs.CL, Recent, May 31, 2026

Deep Research as Rubric for Reinforcement Learning

Wangyi Mei, Zhouhong Gu, Zhenhan Bai, Yin Cai +8 more

The paper proposes Deep Research as Rubric (DR-rubric), a novel evidence-driven framework that treats rubric construction itself as a research problem to generate fine-grained, scalable reward signals…

cs.LG, cs.AI, Recent, May 27, 2026

ProRL: Effective Reinforcement Learning for Proactive Recommendation via Rectified Policy Gradient Estimation

Hongru Hou, Tiehua Mei, Denghui Geng, Jinhui Huang +4 more

The paper proposes ProRL, an effective Reinforcement Learning framework that rectifies gradient estimation deficiencies to optimize proactive recommendation paths, significantly outperforming existing…

cs.CR, Recent, Apr 1, 2026

When Safe Models Merge into Danger: Exploiting Latent Vulnerabilities in LLM Fusion

Jiaqing Li, Zhibo Zhang, Shide Zhou, Yuxi Li +2 more

The paper introduces TrojanMerge, a framework demonstrating that model merging can be exploited to systematically compromise the safety alignment of multiple individually safe LLMs.