Thinking with Imagination: Agentic Visual Spatial Reasoning with World Simulators

Beyond Content Safety: Real-Time Monitoring for Reasoning Vulnerabilities in Large Language Models

This paper introduces a novel framework, the Reasoning Safety Monitor, to detect…

Strengthening Human-Centric Chain-of-Thought Reasoning Integrity in LLMs via a Structured Prompt Fra…

The paper proposes a structured prompt engineering framework to enhance the inte…

Critical-CoT: A Robust Defense Framework against Reasoning-Level Backdoor Attacks in Large Language…

The paper introduces Critical-CoT, a novel two-stage fine-tuning defense framewo…

WebAgentGuard: A Reasoning-Driven Guard Model for Detecting Prompt Injection Attacks in Web Agents

The paper introduces WebAgentGuard, a novel reasoning-driven, multimodal guard m…

Safety, Security, and Cognitive Risks in World Models

This paper surveys the risks associated with world models, proposing a unified t…

TRAP: Hijacking VLA CoT-Reasoning via Adversarial Patches

This paper introduces TRAP, an adversarial attack that demonstrates how physical…

KidsNanny: A Two-Stage Multimodal Content Moderation Pipeline Integrating Visual Classification, Obj…

KidsNanny is a two-stage multimodal content moderation pipeline that achieves hi…

SecPI: Secure Code Generation with Reasoning Models via Security Reasoning Internalization

The paper introduces SecPI, a fine-tuning pipeline that teaches reasoning langua…