Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:
ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Home/Authors/Alberto Giaretta

Alberto Giaretta

2 indexed papers

Recent (6 mo)
2
With code
0
Influential cites
0
Benchmarked
0

Publications per year

2
26

Top categories

AI×2Crypto×2

Frequent co-authors

Matteo Gioele Collu2×
Riccardo Conte2×
Denis Kleyko2×
Mauro Conti2×
Matteo Zavatteri2×
Roberto Confalonieri2×

Research Timeline

2026
Refusal Before Decoding: Detecting and Exploiting Refusal Signals in Intermediate LLM Activations

The paper demonstrates that refusal behavior in Large Language Models (LLMs) is encoded as an actionable, linearly decodable signal in intermediate transformer activations, allowing for early detection and exploitation.

Refusal Before Decoding: Detecting and Exploiting Refusal Signals in Intermediate LLM Activations

The paper demonstrates that refusal behavior in Large Language Models (LLMs) is encoded as an actionable, linearly decodable signal in intermediate transformer activations, allowing for early detection and exploitation.

Highlighted terms show continued research focus across papers

Papers

cs.AIcs.CRRecentMay 27, 2026

Refusal Before Decoding: Detecting and Exploiting Refusal Signals in Intermediate LLM Activations

Matteo Gioele Collu, Riccardo Conte, Alberto Giaretta, Denis Kleyko +3 more

The paper demonstrates that refusal behavior in Large Language Models (LLMs) is encoded as an actionable, linearly decodable signal in intermediate transformer activations, allowing for early detectio…

View →
cs.AIcs.CRRecentMay 27, 2026

Refusal Before Decoding: Detecting and Exploiting Refusal Signals in Intermediate LLM Activations

Matteo Gioele Collu, Riccardo Conte, Alberto Giaretta, Denis Kleyko +3 more

The paper demonstrates that refusal behavior in Large Language Models (LLMs) is encoded as an actionable, linearly decodable signal in intermediate transformer activations, allowing for early detectio…

View →