~ similar to 2604.27674v1 · 20 results
Xinlei Guan, David Arosemena, Tejaswi Dhandu, Kuan Huang +6 more
The paper proposes an end-to-end forensic pipeline using steganographic attribution and multimodal harm detection to reliably trace and attribute harmful misuse of AI-generated imagery on social platf…
ContractShield is a robust multimodal framework that uses a novel three-level fusion mechanism to accurately detect multiple types of vulnerabilities in obfuscated smart contracts, significantly outpe…
The paper introduces Multi-Clip Video (MCV) SafetyBench, a dataset demonstrating that the vulnerability of Multimodal Large Language Models (MLLMs) to jailbreaking increases with the diversity and num…
The paper introduces a robust, two-part framework (HyPE and HyPS) using hyperbolic geometry to efficiently detect and sanitize malicious prompts targeting Vision-Language Models (VLMs).
Chaoshuo Zhang, Yibo Liang, Mengke Tian, Chenhao Lin +5 more
This paper introduces TwoHamsters, a new benchmark that rigorously tests Multi-Concept Compositional Unsafety (MCCU) in text-to-image models, demonstrating that current state-of-the-art models and saf…
Alexander Nemecek, Osama Zafar, Yuqiao Xu, Wenbiao Li +1 more
The paper argues that current AI content watermarking benchmarks fail to test for bias across different languages, cultures, and demographics, proposing a new set of evaluation standards to ensure fai…
The paper demonstrates a class of steganographic exfiltration attacks against vector databases by hiding data within embeddings, and proposes VectorPin, a cryptographic provenance protocol to detect s…
The paper proposes a novel cross-modal backdoor attack that exploits the vulnerability of lightweight connectors in multimodal LLMs, demonstrating high attack success rates across different modalities…
Hanxi Li, Jianan Zhou, Jiale Lao, Yibo Wang +4 more
The paper introduces the Black-Hole Attack, a poisoning vulnerability that exploits geometric defects in high-dimensional embedding spaces to force malicious vectors into the top-k results of vector d…
The paper introduces a dual-dimension evaluation for universal adversarial attacks on Vision-Language Models (VLMs), demonstrating that high reported attack success rates significantly overestimate th…
KidsNanny is a two-stage multimodal content moderation pipeline that achieves high accuracy and efficiency in detecting child safety threats, and performs particularly well on text-embedded content.
The paper introduces ImageProtector, a user-side method that embeds an imperceptible perturbation into images to prevent Multi-modal Large Language Models (MLLMs) from analyzing and extracting sensiti…
Yong Zou, Haoran Li, Fanxiao Li, Shenyang Wei +4 more
The paper introduces REFORGE, a black-box red-teaming framework that uses adversarial image prompts to reveal persistent vulnerabilities in current Image Generation Model Unlearning (IGMU) methods.
This survey provides a comprehensive taxonomy and vulnerability-centric analysis of adversarial attacks targeting Multimodal Large Language Models (MLLMs), offering an explanatory framework for enhanc…
This study provides the first large-scale analysis of video piracy on Telegram, quantifying its massive financial impact and developing Anti-RIP, a resilient detection framework to combat it.
Yani Wang, Yilong Yang, Yang Liu, Zhuzhu Wang +2 more
The paper introduces Distributed Semantic Recomposition (DSR), a novel cross-modal jailbreaking framework that bypasses existing safety filters by decomposing harmful intent into benign input componen…
The paper proposes a decoupled two-stage training pipeline to effectively learn a shared representation for person re-identification by mitigating optimization conflicts between image-based and text-b…
PASA introduces a robust, semantic-level watermarking technique that embeds and detects watermarks in the latent embedding space, successfully resisting semantic-invariant attacks like paraphrasing.
This paper introduces ComicJailbreak, a new benchmark demonstrating that structured visual narratives can effectively jailbreak Multimodal Large Language Models (MLLMs), requiring new safety alignment…
The paper demonstrates that high detection performance against obfuscated prompts does not guarantee representational robustness, identifying a phenomenon called latent embedding collapse.