Propagating Unsafe Actions in LLM Controlled Multi-Robot Collaboration via Single Robot Compromise

Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses

This survey provides a comprehensive, structured review of safety research in Em…

PlanTwin: Privacy-Preserving Planning Abstractions for Cloud-Assisted LLM Agents

PlanTwin introduces a privacy-preserving architecture that allows cloud-hosted L…

Parallax: Why AI Agents That Think Must Never Act

The paper introduces Parallax, an architectural framework that structurally sepa…

Towards Identification and Intervention of Safety-Critical Parameters in Large Language Models

The paper proposes the Expected Safety Impact (ESI) framework to identify safety…

The Verifier Tax: Horizon Dependent Safety Success Tradeoffs in Tool Using LLM Agents

The paper analyzes how runtime safety enforcement impacts the performance of mul…

The Salami Slicing Threat: Exploiting Cumulative Risks in LLM Systems

The paper introduces Salami Slicing Risk, a novel multi-turn jailbreak technique…

Automated Framework to Evaluate and Harden LLM System Instructions against Encoding Attacks

The paper introduces an automated framework demonstrating that LLM system instru…

SkillProbe: Security Auditing for Emerging Agent Skill Marketplaces via Multi-Agent Collaboration

The paper proposes SkillProbe, a multi-agent security auditing framework, demons…