Alan Ritter

2 indexed papers


Top categories

ML ×2, NLP ×1, AI ×1

Frequent co-authors

Ruohao Guo (1×)

Wei Xu (1×)

Minju Gwak (1×)

Minseo Kwak (1×)

Dongseok Lee (1×)

Guijin Son (1×)

Research Timeline

2026

LaRA: Layer-wise Representation Analysis for Detecting Data Contamination in RL Post-Training

The paper proposes LaRA, a layer-wise representation analysis framework that detects data contamination in RL post-trained LLMs by analyzing geometric deviations across model layers.

Investigating and Alleviating Harm Amplification in LLM Interactions

This paper introduces HarmAmp, a new benchmark for multi-turn harm amplification, and proposes TrajSafe, a proactive monitoring system that significantly reduces harmfulness in LLM interactions while maintaining usability.


Papers

cs.CL, cs.LG | Recent | Jun 1, 2026

Investigating and Alleviating Harm Amplification in LLM Interactions

Ruohao Guo, Wei Xu, Alan Ritter


cs.LG, cs.AI | Recent | May 28, 2026