Papers similar to 2603.28043v1

~ similar to 2603.28043v1· 20 results

cs.CRcs.AIcs.CLRecentApr 6, 2026

Mapping the Exploitation Surface: A 10,000-Trial Taxonomy of What Makes LLM Agents Exploit Vulnerabilities

The paper systematically maps LLM agent vulnerabilities by testing 10,000 prompt variations, finding that 'goal reframing' language is the primary trigger for exploitation, rather than broad adversari…

View →

cs.CRstat.APRecentMay 8, 2026

Combating Organized Platform Abuse: Amplifying Weak Risk Signals with Structural Information

Meng He, Jia Long Loh

The paper proposes a novel structural invariant approach, derived from the economic constraints of fraud, that amplifies weak, low-precision signals into highly accurate fraud detections without requi…

View →

cs.CRRecentMay 14, 2026

Topical Shifts in the Dark Web: A Longitudinal Analysis of Content from the Cybercrime Ecosystem

Roy Ricaldi, Maximilian Schafer, Philipp Zech, Luca Allodi +2 more

This study provides a longitudinal analysis of dark web content, revealing that cybercrime discussions are dominated by a few persistent core topics rather than rapidly shifting themes.

View →

cs.CRcs.AIcs.LGRecentMay 11, 2026

Content-Aware Attack Detection in LLM Agent Tool-Call Traffic: An Empirical Study of Features, Architectures, and Evaluation Protocols

Sultan Zavrak

The paper proposes a graph-based framework for detecting attacks in LLM agent tool-call traffic, finding that content-level embeddings are crucial for high accuracy and that tree ensembles on these em…

View →

cs.CRcs.CLRecentMay 26, 2026

BAIT: Boundary-Guided Disclosure Escalation via Self-Conditioned Reasoning

Xuan Luo, Yue Wang, Geng Tu, Jing Li +1 more

The paper introduces BAIT, a three-step jailbreak framework that systematically forces large language models to disclose harmful information by leveraging their internal reasoning and consistency tend…

View →

cs.CRcs.AIRecentMar 17, 2026

Towards Unsupervised Adversarial Document Detection in Retrieval Augmented Generation Systems

Patrick Levi

The paper proposes an unsupervised method using multiple statistical indicators to detect adversarial or compromised context documents in Retrieval Augmented Generation (RAG) systems, even without kno…

View →

cs.CRRecentApr 29, 2026

Indirect Prompt Injection in the Wild: An Empirical Study of Prevalence, Techniques, and Objectives

Soheil Khodayari, Xuenan Zhang, Bhupendra Acharya, Giancarlo Pellegrino

This paper provides a large-scale empirical analysis of indirect prompt injections found in webpages, revealing that prompt-based interference is a widespread, persistent, and growing threat targeting…

View →

cs.CRcs.SIRecentApr 20, 2026

SoK: Analysis of Privacy Risks and Mitigation in Online Propaganda Detection through the PROMPT Framework

Dhiman Goswami, Al Nahian Bin Emran, Md Hasan Ullah Sadi, Sanchari Das

The paper introduces the PROMPT framework to systematically analyze and mitigate privacy risks in online propaganda detection pipelines, demonstrating that current widely used methods are often non-co…

View →

cs.CRcs.CYRecentMay 8, 2026

Binge, Bot, Repeat: Unpacking the Ecosystem of Video Piracy on Telegram

Sadikshya Gyawali, Jaishnoor Kaur, Taylor Graham, Josef Horacek +3 more

This study provides the first large-scale analysis of video piracy on Telegram, quantifying its massive financial impact and developing a resilient detection framework, Anti-RIP, to combat it.

View →

cs.CRcs.LGRecentMay 5, 2026

Membership Inference Attacks for Retrieval Based In-Context Learning for Document Question Answering

Tejas Kulkarni, Antti Koskela, Laith Zumot

This paper demonstrates that retrieval-augmented in-context learning systems for document QA are vulnerable to membership inference attacks, proposing novel black-box methods that exploit query prefix…

View →

cs.CRRecentMay 18, 2026

From Detection to Response: A Deep Learning and Retrieval-Augmented Generation Framework for Network Intrusion Mitigation

Md Navid Bin Islam, Sajal Saha, Senior Member

The paper introduces an end-to-end framework that not only detects network intrusions using deep learning but also generates actionable, citation-grounded mitigation reports using a Retrieval-Augmente…

View →

cs.CLcs.CRcs.LGRecentMar 29, 2026

Hidden Ads: Behavior Triggered Semantic Backdoors for Advertisement Injection in Vision Language Models

Duanyi Yao, Changyue Li, Zhicong Huang, Cheng Hong +1 more

The paper introduces Hidden Ads, a novel backdoor attack for Vision-Language Models (VLMs) that injects unauthorized advertisements by exploiting natural, recommendation-seeking user behaviors, mainta…

View →

cs.CRcs.AIcs.LGRecentMar 28, 2026

Sovereign Context Protocol: An Open Attribution Layer for Human-Generated Content in the Age of Large Language Models

Praneel Panchigar, Torlach Rush, Matthew Canabarro

The paper introduces the Sovereign Context Protocol (SCP), an open-source, attribution-aware data access layer designed to standardize how Large Language Models (LLMs) connect to and track usage of hu…

View →

cs.CRcs.AIRecentMar 25, 2026

Invisible Threats from Model Context Protocol: Generating Stealthy Injection Payload via Tree-based Adaptive Search

Yulin Shen, Xudong Pan, Geng Hong, Min Yang

The paper introduces Tree structured Injection for Payloads (TIP), a novel black-box attack framework that reliably generates stealthy injection payloads to seize control of LLM agents utilizing the M…

View →

cs.CRRecentJun 3, 2026

TeleHunt: A Framework and Tool for Efficient Cybercriminal Community Discovery on Telegram

Roy Ricaldi, Victor Asanache, Luca Allodi

The paper introduces TeleHunt, a comprehensive framework and tool that systematically evaluates various strategies for efficiently discovering cybercriminal communities operating on Telegram.

View →

cs.CRcs.CLRecentJun 4, 2026

An Embarrassingly Simple Detector for Model Extraction Attacks in Large Language Model API Traffic

Shuze Liu, Qianwen Guo, Yushun Dong

The paper proposes an embarrassingly simple detector that monitors model extraction attacks by testing whether the aggregate distribution of incoming LLM queries deviates from the historical distribut…

View →

cs.CRRecentApr 21, 2026

Involuntary In-Context Learning: Exploiting Few-Shot Pattern Completion to Bypass Safety Alignment in GPT-5.4

Alex Polyakov, Daniel Kuznetsov

The paper introduces Involuntary In-Context Learning (IICL), an effective few-shot pattern completion attack that can bypass safety alignments in large language models, achieving a 24.0% bypass rate a…

View →

cs.CRcs.LGRecentMay 18, 2026

Learning to Look Benign: Targeted Evasion of Malware Detectors via API Import Injection

Juozas Dautartas, Olga Kurasova, Juozapas Rokas Čypas, Viktor Medvedev

The paper proposes a framework to intentionally evade malware detectors by adding a small number of benign API imports, successfully demonstrating targeted misclassification into a chosen benign categ…

View →

cs.CRRecentMay 4, 2026

PHANTOM: Polymorphic Honeytoken Adaptation with Narrative-Tailored Organisational Mimicry

Abraham Itzhak Weinberg

PHANTOM is a novel framework that generates highly convincing, context-aware honeytokens by incorporating deep organizational knowledge, significantly improving their believability and detection resis…

View →

cs.CRRecentMay 11, 2026

Context-Aware Spear Phishing: Generative AI-Enabled Attacks Against Individuals via Public Social Media Data

Elham Pourabbas Vafa, Sayak Saha Roy, Shirin Nilizadeh

The paper demonstrates that generative AI can automate and scale highly personalized, context-aware spear-phishing attacks using only public social media data, resulting in messages that are significa…

View →