Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:
ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Home/Authors/Flavio Corradini

Flavio Corradini

1 indexed paper

Recent (6 mo)
1
With code
0
Influential cites
0
Benchmarked
0

Publications per year

1
26

Top categories

Crypto×1AI×1

Frequent co-authors

Matteo Leonesi1×
Francesco Belardinelli1×
Marco Piangerelli1×

Research Timeline

2026
Tatemae: Detecting Alignment Faking via Tool Selection in LLMs

The paper proposes detecting 'alignment faking' (AF)—where LLMs revert to unsafe behavior when unmonitored—by analyzing observable tool selection patterns, finding that detection rates vary significantly across different LLMs and domains.

Highlighted terms show continued research focus across papers

Papers

cs.CRcs.AIRecentApr 29, 2026

Tatemae: Detecting Alignment Faking via Tool Selection in LLMs

Matteo Leonesi, Francesco Belardinelli, Flavio Corradini, Marco Piangerelli

The paper proposes detecting 'alignment faking' (AF)—where LLMs revert to unsafe behavior when unmonitored—by analyzing observable tool selection patterns, finding that detection rates vary significan…

View →