Papers similar to 2605.23448v1

~ similar to 2605.23448v1· 20 results

cs.CRcs.AIRecentMay 8, 2026

CyBiasBench: Benchmarking Bias in LLM Agents for Cyber-Attack Scenarios

Taein Lim, Seongyong Ju, Munhyeok Kim, Hyunjun Kim +1 more

The paper introduces CyBiasBench, a comprehensive benchmark that quantifies the inherent, agent-specific bias in LLM agents' attack selection patterns in cybersecurity scenarios.

View →

econ.THcs.CRcs.GTRecentMay 5, 2026

The Adversarial Discount -- AI, Signal Correlation, and the Cybersecurity Arms Race

James W. Bono

The paper models the cybersecurity arms race using a contest-theoretic framework, showing that full cross-correlation of threat intelligence can neutralize the attacker's structural advantage from inc…

View →

cs.CRRecentMay 28, 2026

Protecting On-Device AI Inference: A Systematic Review of Attacks and Defence Mechanisms

Zisis Tsiatsikas, Alexandros Fakis, Georgios Karopoulos, Vasileios Kouliaridis +1 more

This paper provides the first comprehensive review of threats and defenses specifically targeting on-device AI inference, revealing a significant imbalance where certain attack types, like adversarial…

View →

econ.GNcs.AIcs.CRRecentApr 24, 2026

The Security Cost of Intelligence: AI Capability, Cyber Risk, and Deployment Paradox

Sukwoong Choi

The paper models the trade-off between deploying increasingly capable AI systems and managing associated cyber risks, finding a 'deployment paradox' where high-loss environments with weak governance l…

View →

cs.CRRecentMay 26, 2026

Landseer: Exploring the Machine Learning Defense Landscape

Ayushi Sharma, Rosemary Agbozo, Santiago Torres-Arias, Zahra Ghodsi

The paper introduces Landseer, a modular framework designed to systematically evaluate and compose multiple machine learning defenses to address complex, real-world security requirements.

View →

cs.CRRecentMar 21, 2026

Unveiling the Security Risks of Federated Learning in the Wild: From Research to Practice

Jiahao Chen, Zhiming Zhao, Yuwen Pu, Chunyi Zhou +3 more

This paper argues that much of the existing research on Federated Learning (FL) security is based on idealized assumptions, and provides a practical evaluation framework showing that real-world attack…

View →

cs.CRcs.AIRecentMar 17, 2026

Security Assessment and Mitigation Strategies for Large Language Models: A Comprehensive Defensive Framework

Taiwo Onitiju, Iman Vakilinia

The paper establishes a standardized security assessment framework and develops a multi-layered defensive system, demonstrating that systematic testing and external defenses are crucial for safe LLM d…

View →

cs.CRcs.SERecentMay 13, 2026

Security Incentivization: An Empirical Study of how Micropayments Impact Code Security

Stefan Rass, Martin Pinzger, Rainer W. Alexandrowicz, Georg Sengstbratl +4 more

The paper demonstrates that linking team bonus points to measurable security improvements significantly reduces code security issues in a controlled educational experiment.

View →

cs.CRRecentMay 23, 2026

Reframing LLM Agent Security as an Agent-Human Interaction Problem

Peiran Wang, Ying Li, Yuan Tian

The paper argues that LLM agent security is fundamentally an agent-human interaction (AHI) problem, demonstrating that industry practices rely on human-centric mechanisms while academic research focus…

View →

cs.CRcs.AIcs.LGRecentMar 19, 2026

The Autonomy Tax: Defense Training Breaks LLM Agents

Shawn Li, Yue Zhao

Defense training for LLM agents, intended to improve safety, systematically degrades their core competence, leading to unreliability in multi-step tasks.

View →

cs.CRcs.AIRecentMay 6, 2026

Shattering the Echo Chamber: Hidden Safeguards in Manuscripts Against the AI Takeover of Peer Review

Oubo Ma, Ruixiao Lin, Jiahao Chen, Yuan Su +2 more

The paper proposes IntraGuard, a black-box, venue-agnostic defense framework that embeds hidden instructions into manuscripts via PDF structure to disrupt AI-generated peer reviews, achieving up to 84…

View →

cs.MAcs.CRcs.LGRecentApr 25, 2026

Architecture Matters for Multi-Agent Security

Ben Hagag, William L. Anderson, Christian Schroeder de Witt, Sarah Scheffler

This paper empirically demonstrates that the architectural design of multi-agent systems significantly impacts their security, finding that coordination mechanisms can introduce vulnerabilities greate…

View →

cs.CRcs.AIcs.CLRecentApr 6, 2026

Mapping the Exploitation Surface: A 10,000-Trial Taxonomy of What Makes LLM Agents Exploit Vulnerabilities

Charafeddine Mouzouni

The paper systematically maps LLM agent vulnerabilities by testing 10,000 prompt variations, finding that 'goal reframing' language is the primary trigger for exploitation, rather than broad adversari…

View →

cs.CRRecentJun 2, 2026

Operationalizing Cyber Attack Prediction: A Gap-Prioritized Framework with Dataset and Model Selection Guidelines

Aminu Muhammad Auwal

This paper proposes a gap-prioritization framework to bridge the gap between theoretical cyber attack prediction research and practical operational deployment by identifying critical implementation hu…

View →

cs.CRcs.AIRecentApr 10, 2026

Conflicts Make Large Reasoning Models Vulnerable to Attacks

Honghao Liu, Chengjin Xu, Xuhui Jiang, Cehao Yang +4 more

The paper demonstrates that confronting Large Reasoning Models (LRMs) with conflicting objectives, such as contradictory choices or conflicting alignment values, significantly increases their vulnerab…

View →

cs.CRcs.AIRecentMar 23, 2026

Towards Secure Retrieval-Augmented Generation: A Comprehensive Review of Threats, Defenses and Benchmarks

Yanming Mu, Hao Hu, Feiyang Li, Qiao Yuan +6 more

This paper provides the first comprehensive, end-to-end survey dedicated to the security of Retrieval-Augmented Generation (RAG) systems, systematically mapping threats, defenses, and benchmarks acros…

View →

cs.CLcs.CRRecentApr 29, 2026

SafeReview: Defending LLM-based Review Systems Against Adversarial Hidden Prompts

Yuan Xin, Yixuan Weng, Minjun Zhu, Ying Ling +4 more

The paper proposes SafeReview, a co-evolutionary adversarial training framework that significantly improves the robustness of LLM-based peer review systems against sophisticated adversarial hidden pro…

View →

cs.CRcs.AIRecentApr 9, 2026

Securing Retrieval-Augmented Generation: A Taxonomy of Attacks, Defenses, and Future Directions

Yuming Xu, Mingtao Zhang, Zhuohan Ge, Haoyang Li +6 more

This paper proposes a comprehensive taxonomy (SLOT) to systematically categorize security risks, attacks, and defenses specific to Retrieval-Augmented Generation (RAG), clarifying that these risks are…

View →

cs.CRcs.AIcs.CLRecentJun 3, 2026

Domain-Conditioned Safety in Frontier Computer-Using Agents: A 793-Episode Browser Benchmark, a Coding-Domain Cross-Reference, and a Reproducibility Audit of Recent Red-Teaming

Nicholas Saban

The paper benchmarks current frontier computer-using agents against hand-crafted attacks, finding that while they are highly safe in browser tasks, this safety does not generalize to other domains lik…

View →

cs.LGcs.CRRecentMay 18, 2026

A No-Defense Defense Against Gradient-Based Adversarial Attacks on ML-NIDS: Is Less More?

Mohamed elShehaby, Ashraf Matrawy

The paper demonstrates that simpler, shallower Deep Neural Network architectures with reduced features and ReLU activations can inherently improve the robustness of ML-NIDS against gradient-based adve…

View →