Jason Pacheco
1 indexed paper
Recent (6 mo): 1
With code: 0
Influential cites: 0
Benchmarked: 0
Publications per year
126
Top categories
ML ×1, AI ×1, Crypto ×1
Research Timeline
2026
Information Theoretic Adversarial Training of Large Language Models
The paper proposes WARDEN, a distributionally robust adversarial training framework that significantly reduces LLM vulnerability to adversarial attacks by dynamically reweighting hard adversarial examples within a divergence ball.
Papers
cs.LG, cs.AI, cs.CR | Recent | May 6, 2026
Information Theoretic Adversarial Training of Large Language Models
Yiwei Zhang, Jeremiah Birrell, Reza Ebrahimi, Rouzbeh Behnia +2 more