Papers similar to 2603.16405v1

~ similar to 2603.16405v1· 20 results

cs.CRcs.CVRecentApr 14, 2026

Scaling Exposes the Trigger: Input-Level Backdoor Detection in Text-to-Image Diffusion Models via Cross-Attention Scaling

Zida Li, Jun Li, Yuzhe Sha, Ziqiang Li +2 more

The paper introduces SET, a robust input-level backdoor detection framework that detects hidden malicious triggers in text-to-image diffusion models by analyzing systematic differences in how benign a…

View →

cs.CVcs.CRRecentMay 7, 2026

Backdoor Mitigation in Object Detection via Adversarial Fine-Tuning

Kealan Dunnett, Reza Arablouei, Dimity Miller, Volkan Dedeoglu +1 more

The paper proposes a detection-aware adversarial fine-tuning framework to mitigate backdoor attacks in object detection models, achieving better defense while preserving clean detection performance co…

View →

cs.CRcs.LGRecentMay 19, 2026

Awakening the Hydra: Stabilizing Multi-Concept Backdoor Injection in Text-to-Image Diffusion Models

Kai Wang, Jiale Zhang, Chengcheng Zhu, Chuang Ma +1 more

The paper proposes Hydra, a framework to stabilize and control the injection of multiple, conflicting backdoor triggers into text-to-image diffusion models, ensuring high attack reliability while main…

View →

cs.CRcs.AIcs.CVRecentMar 31, 2026

Beyond Corner Patches: Semantics-Aware Backdoor Attack in Federated Learning

Kavindu Herath, Joshua Zhao, Saurabh Bagchi

This paper proposes SABLE, a method for generating semantically meaningful and in-distribution backdoor triggers for federated learning, demonstrating that such attacks remain a potent and practical t…

View →

cs.CVcs.CRcs.LGRecentMay 14, 2026

Systematic Discovery of Semantic Attacks in Online Map Construction through Conditional Diffusion

Chenyi Wang, Ruoyu Song, Raymond Muller, Jean-Philippe Monteuuis +4 more

The paper introduces MIRAGE, a framework that systematically discovers semantic attacks on online HD map construction by finding plausible environmental variations that bypass standard adversarial def…

View →

cs.CRcs.AIRecentApr 23, 2026

CSC: Turning the Adversary's Poison against Itself

Yuchen Shi, Xin Guo, Huajie Chen, Tianqing Zhu +2 more

The paper proposes Cluster Segregation Concealment (CSC), a novel defense that identifies and neutralizes backdoor triggers by relabeling poisoned samples to a virtual class, achieving near-zero attac…

View →

cs.CLcs.CRcs.LGRecentMar 29, 2026

Hidden Ads: Behavior Triggered Semantic Backdoors for Advertisement Injection in Vision Language Models

Duanyi Yao, Changyue Li, Zhicong Huang, Cheng Hong +1 more

The paper introduces Hidden Ads, a novel backdoor attack for Vision-Language Models (VLMs) that injects unauthorized advertisements by exploiting natural, recommendation-seeking user behaviors, mainta…

View →

cs.CRcs.AIRecentMay 17, 2026

Lightweight and Fast Backdoor Model Detection

Yinbo Yu, Jing Fang, Xuewen Zhang, Chunwei Tian +3 more

The paper proposes DFBScanner, a lightweight static parameter inspection framework that detects backdoor attacks by analyzing anomalous parameter updates in the final classification layer, achieving f…

View →

cs.CRcs.AIRecentMay 19, 2026

Token by Token, Compromised: Backdoor Vulnerabilities in Unified Autoregressive Models

Tobias Braun, Jonas Henry Grebe, Hossein Shakibania, Anna Rohrbach +1 more

This paper introduces the Token by Token Backdoor Attack (ToBAC), demonstrating that unified autoregressive models (UAMs) are vulnerable to backdoor attacks where a single trigger can compromise multi…

View →

cs.LGcs.CRRecentMay 27, 2026

Density-aware Sample-specific Attack

Qiyuan Wang, Yao Li, Raymond K. W. Wong

This paper proposes a density-aware attack that constructs triggers by placing poisoned samples in low-density regions of the clean data distribution, achieving high attack success rates even after st…

View →

cs.CRcs.CVRecentMay 29, 2026

BadBone: Backdoor Attacks Against Backbone Models in Visual Prompt Learning

Ziqing Yang, Rui Wen, Xinlei He, Yun Shen +2 more

The paper introduces BadBone, a stealthy and adaptive backdoor attack that compromises a backbone model specifically to target downstream tasks utilizing prompt learning, demonstrating high attack suc…

View →

cs.CRcs.CVRecentApr 17, 2026

TwoHamsters: Benchmarking Multi-Concept Compositional Unsafety in Text-to-Image Models

Chaoshuo Zhang, Yibo Liang, Mengke Tian, Chenhao Lin +5 more

This paper introduces TwoHamsters, a new benchmark that rigorously tests Multi-Concept Compositional Unsafety (MCCU) in text-to-image models, demonstrating that current state-of-the-art models and saf…

View →

cs.CRRecentMay 8, 2026

Cross-Modal Backdoors in Multimodal Large Language Models

Runhe Wang, Li Bai, Haibo Hu, Songze Li

The paper proposes a novel cross-modal backdoor attack that exploits the vulnerability of lightweight connectors in multimodal LLMs, demonstrating high attack success rates across different modalities…

View →

cs.CRRecentApr 27, 2026

DETOUR: A Practical Backdoor Attack against Object Detection

Dazhuang Liu, Yanqi Qiao, Rui Wang, Kaitai Liang +1 more

DETOUR proposes a practical backdoor attack against object detection models by using semantic triggers that are robust to variations in size, location, and field of view (FoV), overcoming limitations…

View →

cs.CRcs.CVRecentMay 2, 2026

Checkerboard: A Simple, Effective, Efficient and Learning-free Clean Label Backdoor Attack with Low Poisoning Budget

Yi Yang, Jinyang Huang, Binbin Liu, Feng-Qi Cui +4 more

The paper introduces Checkerboard, a novel, learning-free clean-label backdoor attack that efficiently poisons training data to compromise model integrity with minimal poisoning budget.

View →

cs.CRcs.LGRecentApr 7, 2026

Stealthy and Adjustable Text-Guided Backdoor Attacks on Multimodal Pretrained Models

Yiyang Zhang, Chaojian Yu, Ziming Hong, Yuanjie Shao +3 more

The paper proposes a novel Text-Guided Backdoor (TGB) attack that uses common words in text descriptions as stealthy triggers for multimodal models, enhancing practicality and controllability.

View →

cs.CLcs.AIcs.CRRecentMay 8, 2026

Activation Differences Reveal Backdoors: A Comparison of SAE Architectures

Sachin Kumar

The paper compares two sparse autoencoder architectures, finding that Differential SAEs (Diff-SAE) significantly outperform Crosscoders in isolating backdoor-related features in language models.

View →

cs.CRcs.AIRecentApr 10, 2026

BadSkill: Backdoor Attacks on Agent Skills via Model-in-Skill Poisoning

Guiyao Tie, Jiawen Shi, Pan Zhou, Lichao Sun

The paper introduces BadSkill, a novel backdoor attack formulation that targets third-party agent skills by poisoning the embedded model artifacts, achieving high attack success rates across various m…

View →

cs.CRRecentMay 10, 2026

BadDLM: Backdooring Diffusion Language Models with Diverse Targets

Shengfang Zhai, Xiaoyang Ji, Yuling Shi, Haoran Gao +5 more

The paper introduces BadDLM, a unified framework that demonstrates a new class of backdoor vulnerabilities in Diffusion Language Models (DLMs) by exploiting their forward masking process across divers…

View →

cs.CRRecentApr 9, 2026

Follow My Eyes: Backdoor Attacks on VLM-based Scanpath Prediction

Diana Romero, Mutahar Ali, Momin Ahmad Khan, Habiba Farrukh +2 more

This paper introduces the first backdoor attacks against VLM-based scanpath prediction, demonstrating variable-output attacks that evade detection and survive deployment on edge devices.

View →