Jing Chen
11 indexed papers
Research Timeline
This survey provides a comprehensive, structured review of safety research in Embodied AI, analyzing attacks and defenses across the entire embodied pipeline to guide the development of safe, robust, and reliable real-world agents.
The paper introduces CAN-QA, a novel question-answering benchmark that reformulates CAN traffic analysis from a classification task to a reasoning task, demonstrating that current LLMs struggle with complex temporal and behavioral reasoning over vehicle network data.
The paper introduces PhishEye, a fully dynamic self-supervised system that models Ethereum transactions as a heterogeneous temporal attributed multi-graph and uses temporal graph contrastive learning to achieve high accuracy in detecting phishing activities.
The paper introduces EvoPoC, a knowledge-driven agentic system that automates the synthesis of verifiable and economically viable exploits for DeFi smart contracts, achieving high recall and significant revenue recovery rates.
The paper introduces Babel, an efficient black-box attack framework that systematically exploits intrinsic safety gaps in LLMs by optimizing text obfuscation sampling, achieving state-of-the-art jailbreak success rates on commercial models.
The paper introduces PlanAudio, a unified LLM-based framework that directly synthesizes natural, composite audio containing speech and sounds from unconstrained free-form text prompts, outperforming existing methods.
The paper introduces Score-Guided Classification (SGC), a novel framework that uses an unsupervised anomaly score as a 'Pathological Prior' to guide EEG-based depression detection, overcoming the limitations of data augmentation in small-sample settings.
The paper introduces CV-Arena, a large-scale open benchmark for instructional computer vision, demonstrating that professional-grade image editing requires advanced capabilities in physical reasoning and structural control.
The paper introduces InsightVQA, a large-scale benchmark dataset designed for hierarchical visual question answering that assesses complex emotion understanding and cognitive reasoning beyond simple emotion recognition.
The paper introduces RoboTrustBench, a comprehensive benchmark that evaluates the trustworthiness of video world models for robotic manipulation across challenging scenarios, finding that current models fail in complex reasoning and safety checks.
The paper proposes a novel framework, LPCD, that uses latent causal modeling to robustly assess evolving adversarial risks in live streaming by decoupling malicious intent from superficial tactical shifts.
Papers
InsightVQA: High-Dimensional Emotion-Cognitive Visual Question Answering Benchmark
Shiyu Wang, Ziyu Liu, Chaoyi Yu, Yujie Yin +5 more
The paper introduces InsightVQA, a large-scale benchmark dataset designed for hierarchical visual question answering that assesses complex emotion understanding and cognitive reasoning beyond simple e…