Chenhao Lin
5 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper analyzes that while multimodal large language models (MLLMs) offer superior semantic understanding for image generation, this enhanced capability significantly increases safety risks, particularly in generating unsafe content and creating harder-to-detect fake images compared to traditional diffusion models.
This paper introduces TwoHamsters, a new benchmark that rigorously tests Multi-Concept Compositional Unsafety (MCCU) in text-to-image models, demonstrating that current state-of-the-art models and safety defenses are highly vulnerable to subtle, compositionally unsafe prompts.
MACReD introduces a hierarchical multi-agent framework that achieves state-of-the-art performance in parsing complex chemical reaction diagrams by coordinating specialized agents for perception and global reasoning.
SeClaw is a new framework that synthesizes security tasks from structured risk specifications to evaluate autonomous LLM agents' behavior in stateful environments, focusing on the process of unsafe actions rather than just the final outcome.
SeClaw is a new framework that uses specification-driven task synthesis to create comprehensive and controllable security benchmarks for evaluating the unsafe behaviors of autonomous LLM agents.
Papers
SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents
Hao Cheng, Changtao Miao, Tianle Song, Yin Wu +20 more
SeClaw is a new framework that synthesizes security tasks from structured risk specifications to evaluate autonomous LLM agents' behavior in stateful environments, focusing on the process of unsafe ac…