Alberto Giaretta

2 indexed papers

Top categories

AI ×2, Crypto ×2

Frequent co-authors

Matteo Gioele Collu (2×)

Riccardo Conte (2×)

Denis Kleyko (2×)

Mauro Conti (2×)

Matteo Zavatteri (2×)

Roberto Confalonieri (2×)

Research Timeline

2026

Refusal Before Decoding: Detecting and Exploiting Refusal Signals in Intermediate LLM Activations

The paper shows that refusal behavior in large language models (LLMs) is encoded as a linearly decodable signal in intermediate transformer activations, which allows refusals to be detected and exploited before decoding.

Papers

cs.AI, cs.CR · Recent · May 27, 2026

Refusal Before Decoding: Detecting and Exploiting Refusal Signals in Intermediate LLM Activations

Matteo Gioele Collu, Riccardo Conte, Alberto Giaretta, Denis Kleyko +3 more

cs.AI, cs.CR · Recent · May 27, 2026