The paper proposes a novel attack paradigm demonstrating how compromising a single robot in an LLM-controlled multi-robot system can rapidly propagate malicious intent to cause coordinated unsafe actions across the entire system.
Large language models (LLMs) are increasingly used as general planners in embodied intelligence, enabling high-level coordination and low-level task planning for both single-robot operation and multi-robot collaboration. This growing reliance on embodied LLM planners also raises critical security concerns, since misaligned or manipulated instructions can be translated into physical actions. Prior work has studied such threats in single-robot settings, while security risks in LLM-controlled multi-robot collaboration, especially those propagated through inter-robot communication, remain largely unexplored. To bridge this gap, we propose a novel attack paradigm for multi-robot systems in which the adversary interacts with only a single entry robot. The compromised robot then propagates malicious intent through peer communication, leading to coordinated unsafe actions across the system. Our evaluation, covering the high-risk dimensions of dereliction of duty, privacy compromise, and public safety hazards, reveals a persistent safety alignment gap in multi-robot planners. We quantify the attack with three metrics: obedience, infectiousness, and stealthiness. Experiments demonstrate both persistent attacker control and rapid propagation: obedience reaches 1.00 in the strongest cases, and infectiousness rises to 0.90. Notably, the attack is highly efficient, requiring as few as 3.0 rounds to compromise all the robots while maintaining a stealthiness score of 0.81. Such risks are amplified when robots must resolve trade-offs in critical situations, such as emergencies or conflicts of rights, because the coordination mechanism can unintentionally allow adversarial instructions to override safety requirements. The code is available at https://github.com/TheFatInsect/InfectBot.
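To make the idea of round-by-round propagation from a single entry robot concrete, the following is a minimal toy sketch, not the paper's method or its metric definitions (the abstract does not give them). It assumes a hypothetical communication graph `peers` and a worst-case model in which every compromised robot passes the malicious intent to all of its direct peers in each round; the function name and the robot names are invented for illustration, and the resulting round counts are unrelated to the figures reported above.

```python
from collections import deque


def rounds_to_full_compromise(peers, entry):
    """Toy worst-case model (an assumption, not the paper's procedure):
    in each round, every compromised robot infects all of its direct peers.

    peers: dict mapping each robot name to a list of robots it talks to.
    entry: the single robot the adversary interacts with.
    Returns the number of rounds until every robot is compromised,
    or None if some robot cannot be reached from the entry robot.
    """
    # Map each compromised robot to the round in which it was compromised.
    compromised = {entry: 0}
    queue = deque([entry])
    while queue:
        robot = queue.popleft()
        for peer in peers.get(robot, []):
            if peer not in compromised:
                compromised[peer] = compromised[robot] + 1
                queue.append(peer)
    if len(compromised) < len(peers):
        return None  # at least one robot is unreachable
    return max(compromised.values())


# Hypothetical four-robot chain: r0 - r1 - r2 - r3
chain = {"r0": ["r1"], "r1": ["r0", "r2"], "r2": ["r1", "r3"], "r3": ["r2"]}
print(rounds_to_full_compromise(chain, "r0"))
print(rounds_to_full_compromise(chain, "r1"))
```

In this simplified model, the choice of entry robot and the shape of the communication graph alone determine how quickly the whole system is reached. A real LLM planner may resist or only partially adopt the injected intent, which is what a measure such as infectiousness would need to capture.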
Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses
This survey provides a comprehensive, structured review of safety research in Em…
PlanTwin: Privacy-Preserving Planning Abstractions for Cloud-Assisted LLM Agents
PlanTwin introduces a privacy-preserving architecture that allows cloud-hosted L…
Parallax: Why AI Agents That Think Must Never Act
The paper introduces Parallax, an architectural framework that structurally sepa…
Towards Identification and Intervention of Safety-Critical Parameters in Large Language Models
The paper proposes the Expected Safety Impact (ESI) framework to identify safety…
The Verifier Tax: Horizon Dependent Safety Success Tradeoffs in Tool Using LLM Agents
The paper analyzes how runtime safety enforcement impacts the performance of mul…
The Salami Slicing Threat: Exploiting Cumulative Risks in LLM Systems
The paper introduces Salami Slicing Risk, a novel multi-turn jailbreak technique…
Automated Framework to Evaluate and Harden LLM System Instructions against Encoding Attacks
The paper introduces an automated framework demonstrating that LLM system instru…
SkillProbe: Security Auditing for Emerging Agent Skill Marketplaces via Multi-Agent Collaboration
The paper proposes SkillProbe, a multi-agent security auditing framework, demons…