Xiao Li
17 indexed papers
Research Timeline
The paper introduces REFORGE, a black-box red-teaming framework that uses adversarial image prompts to reveal persistent vulnerabilities in current Image Generation Model Unlearning (IGMU) methods.
This paper systematically audits the safety implications of activation steering vectors, finding that these vectors significantly influence the success rate of jailbreak attacks by overlapping with latent refusal directions.
This survey provides a comprehensive, structured review of safety research in Embodied AI, analyzing attacks and defenses across the entire embodied pipeline to guide the development of safe, robust, and reliable real-world agents.
The paper proposes Mean MAE (MMAE), a novel self-supervised pre-training framework that uses flow mixing and teacher-student distillation to improve encrypted traffic classification by capturing multi-granularity context.
The paper introduces R$^2$A, an adversarial attack that uses suffix optimization to mislead black-box LLM routers into consistently selecting expensive, high-capability models.
The paper proposes IntraGuard, a black-box, venue-agnostic defense framework that embeds hidden instructions into manuscripts via PDF structure to disrupt AI-generated peer reviews, achieving a defense success rate of up to 84%.
The paper introduces the PrivacyIceberg framework to systematically categorize and empirically demonstrate the high risk of automated, deep personal profiling using LLM agents, revealing a significant gap between public concern and platform safeguards.
This paper introduces UPAttack, a novel threat model demonstrating that focusing on explicit usability requirements can cause LLMs to generate insecure code by neglecting implicit security constraints, and proposes U-SPLOIT to automate this attack.
The paper introduces FlowSteer, a prompt-only attack that exploits vulnerabilities in how multi-agent LLM systems plan workflows, significantly increasing the success rate of malicious signal propagation.
This paper systematically investigates how various plasticity interventions affect the vulnerability of deep reinforcement learning agents to backdoor attacks, finding that most interventions mitigate threats while one specific intervention exacerbates them.
The paper introduces a multi-dimensional evasion framework and a new benchmark (A3S-Bench) to test autonomous agents, demonstrating that stateful, multi-turn attacks significantly increase system risk.
The paper introduces RHELM, a new benchmark designed to test LLMs' long-term memory by simulating realistic, complex, and evolving dialogues that integrate multiple heterogeneous data sources.
This paper addresses the challenge of detecting and explaining AI-manipulated segments within long, untrimmed videos by proposing a new benchmark and a coarse-to-fine forensic detection framework.
InfoMerge is a novel, training-free method that significantly compresses visual tokens for Video-LLMs by estimating temporal redundancy and allocating tokens based on content richness, achieving high efficiency with minimal performance loss.
MOSS-Audio is a unified audio-language model designed for comprehensive understanding of speech, environmental sounds, and music, achieving strong performance across various audio-grounded tasks.
The paper introduces EvoNote, a self-evolving agentic framework that significantly improves the generation of evidence-grounded health community notes by utilizing an accumulated memory of past misinformation correction experiences.
ImageAuditor introduces a novel Membership Inference Attack (MIA) specifically designed for Image-based Retrieval-Augmented Generation (IRAG) systems, achieving high accuracy by addressing cross-modal retrieval and discriminative signal extraction challenges.
Papers
ImageAuditor: Membership Inference Attack against Image-based Retrieval-Augmented Generation
Jinghuai Zhang, Pengyue Yu, Zhexiao Lin, Kunlin Cai +2 more