~ similar to 2604.16515v1· 20 results
The paper evaluates the adversarial robustness of two open-source Vision-Language Models (LLaVA and Qwen2.5-VL) in a simulated e-commerce environment, finding that while LLaVA is vulnerable to gradien…
Hao Yang, Zhuo Ma, Yang Liu, Yilong Yang +2 more
The paper introduces CrossMPI, a novel cross-modal prompt injection attack that uses image-only perturbations to steer the interpretation of both textual and visual inputs in Large Vision-Language Mod…
Haolin Deng, Xin Zou, Zhiwei Jin, Chen Chen +2 more
The paper proposes In-Context Visual Contrastive Optimization (IC-VCO) to rigorously mitigate multimodal hallucinations in Vision-Language Models by optimizing contrastive learning within a shared mul…
This survey provides a comprehensive taxonomy and vulnerability-centric analysis of adversarial attacks targeting Multimodal Large Language Models (MLLMs), offering an explanatory framework for enhanc…
The paper introduces Evidence-Carrying Agents (ECA) to prevent multimodal agents from executing privileged actions based on unsupported or hallucinated perceptual claims, achieving near-zero unsafe ex…
The paper proposes a novel cross-modal backdoor attack that exploits the vulnerability of lightweight connectors in multimodal LLMs, demonstrating high attack success rates across different modalities…
The paper proposes BRACS, a training-free steering framework that adaptively corrects visual grounding failures in large vision-language models, significantly reducing object hallucination without sac…
The paper models the cybersecurity arms race using a contest-theoretic framework, showing that full cross-correlation of threat intelligence can neutralize the attacker's structural advantage from inc…
Chengshuai Zhao, Zhen Tan, Dawei Li, Zhiyuan Yu +1 more
The paper proposes MMGuard, a proactive defense mechanism that injects unlearnable, human-imperceptible perturbations into multimodal data to prevent unauthorized fine-tuning of Large Vision-Language…
Qian Chen, Xianyin Zhang, Yanzhi Liu, Lifan Guo +2 more
This paper introduces CFMME, a comprehensive Chinese financial multimodal benchmark, and evaluates current Large Vision-Language Models (LVLMs), finding that while state-of-the-art models perform mode…
Duanyi Yao, Changyue Li, Zhicong Huang, Cheng Hong +1 more
The paper introduces Hidden Ads, a novel backdoor attack for Vision-Language Models (VLMs) that injects unauthorized advertisements by exploiting natural, recommendation-seeking user behaviors, mainta…
The paper proposes a utility-aware multimodal contrastive learning framework that optimizes product image generation not just for semantic coherence, but also for maximizing consumer demand in online…
Yiyang Zhang, Chaojian Yu, Ziming Hong, Yuanjie Shao +3 more
The paper proposes a novel Text-Guided Backdoor (TGB) attack that uses common words in text descriptions as stealthy triggers for multimodal models, enhancing practicality and controllability.
ContractShield is a robust multimodal framework that uses a novel three-level fusion mechanism to accurately detect multiple types of vulnerabilities in obfuscated smart contracts, significantly outpe…
Zixin Zhang, Fan Qi, Shuai Li, Xiaoshan Yang +1 more
The paper proposes FedMChain, a novel federated learning framework that structures multimodal training into sequential phases to mitigate modality competition and improve model performance while reduc…
The paper introduces a dual-dimension evaluation for universal adversarial attacks on Vision-Language Models (VLMs), demonstrating that high reported attack success rates significantly overestimate th…
Buyun Liang, Jinqi Luo, Liangzu Peng, Kwan Ho Ryan Chan +5 more
The paper introduces REALISTA, a novel latent-space adversarial attack framework that generates semantically realistic and coherent prompts to effectively induce hallucinations in large language model…
The paper introduces ImageProtector, a user-side method that embeds an imperceptible perturbation into images to prevent Multi-modal Large Language Models (MLLMs) from analyzing and extracting sensiti…
Chenhao Fang, Jordi Mola, Mark Harman, Jason Nawrocki +9 more
The paper introduces a Hybrid Utility Minimum Bayes Risk (HUMBR) framework to significantly reduce hallucinations in high-stakes enterprise AI workflows, outperforming standard consistency methods.
Ye Sun, Xin Wang, Jiaming Zhang, Yifeng Gao +6 more
DarkLLM introduces a novel framework that uses a Large Language Model (LLM) to translate natural language instructions into flexible, latent adversarial attack vectors, demonstrating a systemic vulner…