Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:
ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Home/Authors/Jiacheng Liang

Jiacheng Liang

2 indexed papers

Recent (6 mo)
2
With code
0
Influential cites
0
Benchmarked
0

Publications per year

2
26

Top categories

Crypto×2AI×2NLP×1ML×1

Frequent co-authors

Yuhui Wang1×
Tanqiu Jiang1×
Charles Fleming1×
Ting Wang1×
Yao Ma1×
Tharindu Kumarage1×

Research Timeline

2026
ARES: Adaptive Red-Teaming and End-to-End Repair of Policy-Reward System

ARES is a novel framework that systematically discovers and mitigates dual vulnerabilities in RLHF systems by simultaneously testing the core LLM and its Reward Model (RM) using structured adversarial prompts, leading to enhanced safety robustness.

MAGE: Safeguarding LLM Agents against Long-Horizon Threats via Shadow Memory

The paper introduces MAGE, a novel defensive framework that uses a dedicated 'shadow memory' to proactively detect and mitigate long-horizon threats against LLM agents during complex, multi-step interactions.

Highlighted terms show continued research focus across papers

Papers

cs.CRcs.AIcs.CLRecentMay 4, 2026

MAGE: Safeguarding LLM Agents against Long-Horizon Threats via Shadow Memory

Yuhui Wang, Tanqiu Jiang, Jiacheng Liang, Charles Fleming +1 more

The paper introduces MAGE, a novel defensive framework that uses a dedicated 'shadow memory' to proactively detect and mitigate long-horizon threats against LLM agents during complex, multi-step inter…

View →
cs.AIcs.CRcs.LGRecentApr 20, 2026

ARES: Adaptive Red-Teaming and End-to-End Repair of Policy-Reward System

Jiacheng Liang, Yao Ma, Tharindu Kumarage, Satyapriya Krishna +4 more

ARES is a novel framework that systematically discovers and mitigates dual vulnerabilities in RLHF systems by simultaneously testing the core LLM and its Reward Model (RM) using structured adversarial…

View →