Ari Holtzman

2 indexed papers

Top categories

AI ×2, ML ×1, Crypto ×1

Frequent co-authors

Todd Nief (1×)

Harvey Yiyun Fu (1×)

Mark Muchane (1×)

Peter West (1×)

Research Timeline

2026

Can You Keep a Secret? Involuntary Information Leakage in Language Model Writing

Frontier language models involuntarily leak secret information through thematic elements in their writing, even when explicitly instructed to keep the secret hidden.

Subliminal Learning is a LoRA Artifact

The paper demonstrates that 'subliminal learning', the transmission of behavioral traits between language models, is not a fundamental learning mechanism but a fragile artifact of LoRA fine-tuning and specific contextual tokens.

Papers

cs.AI, cs.LG | Recent | May 30, 2026

Subliminal Learning is a LoRA Artifact

Todd Nief, Harvey Yiyun Fu, Mark Muchane, Ari Holtzman

cs.CR, cs.AI | Recent | May 11, 2026