Francesco Belardinelli

3 indexed papers

Top categories

ML ×2, AI ×2, Multiagent ×1, Logic ×1, Crypto ×1

Frequent co-authors

Edwin Hamel-De le Court (2×)

Omar Adalat (1×)

Research Timeline

2026

Tatemae: Detecting Alignment Faking via Tool Selection in LLMs

The paper proposes detecting 'alignment faking' (AF), in which LLMs revert to unsafe behavior when unmonitored, by analyzing observable tool selection patterns. It finds that detection rates vary significantly across LLMs and domains.

Robust Shielding for Safe Reinforcement Learning

The paper introduces a novel shielding framework for Robust MDPs (RMDPs) that guarantees safety under worst-case transition probabilities, enabling safe reinforcement learning even when transition dynamics are unknown.

Contract-Based Compositional Shielding for Safe Multi-Agent Reinforcement Learning

This paper proposes a method for ensuring safety in multi-agent reinforcement learning through decentralized execution, using a shared global specification and a non-stationary multi-armed bandit.

Papers

cs.LG | cs.MA | Empirical | Recent | Jun 12, 2026

Contract-Based Compositional Shielding for Safe Multi-Agent Reinforcement Learning

Omar Adalat, Edwin Hamel-De le Court, Francesco Belardinelli

This paper proposes a method for ensuring safety in multi-agent reinforcement learning through decentralized execution, using a shared global specification and a non-stationary multi-armed bandit.

cs.AI | cs.LG | cs.LO | Recent