Papers similar to 2606.06013v1

20 results

cs.CR · cs.AI · cs.CV · Recent · May 11, 2026

BEACON: A Multimodal Dataset for Learning Behavioral Fingerprints from Gameplay Data

Ishpuneet Singh, Gursmeep Kaur, Uday Pratap Singh Atwal, Guramrit Singh +2 more

The paper introduces BEACON, a large-scale, multimodal dataset capturing diverse behavioral signals from competitive Valorant gameplay, designed for rigorous testing of continuous authentication and b…

cs.GT · cs.AI · cs.CR · Recent · May 14, 2026

Watermarking Game-Playing Agents in Perfect-Information Extensive-Form Games

Juho Kim, Fei Fang, Tuomas Sandholm

This paper adapts LLM watermarking techniques, specifically the KGW watermark, to create detectable watermarks for AI game-playing strategies in perfect-information games, showing minimal impact on ga…

cs.CR · cs.CY · cs.LG · Recent · Apr 11, 2026

"bot lane noob" Towards Deployment of NLP-based Toxicity Detectors in Video Games

Jonas Ave, Irdin Pekaric, Matthias Frohner, Giovanni Apruzzese

This paper addresses the lack of specialized NLP tools for detecting toxicity in real-time video game chat by creating a large, fine-grained dataset and developing a superior, domain-specific detector…

cs.CR · cs.AI · Recent · May 29, 2026

Stateful Online Monitoring Catches Distributed Agent Attacks

Davis Brown, Samarth Bhargav, Arav Santhanam, Kasper Hong +6 more

The paper introduces a novel stateful online monitoring system that detects distributed multi-agent cyberattacks by aggregating weak suspiciousness signals across many user accounts, overcoming the bl…

cs.CR · cs.HC · Recent · May 14, 2026

Analyzing Codes of Conduct for Online Safety in Video Games at Scale

Jiuming Jiang, Shidong Pan, Daniel W Woods, Jingjie Li

The paper analyzes Codes of Conduct (CoCs) for online video games using a novel pipeline, finding that most multiplayer games lack CoCs despite safety needs, and that CoCs often lack specificity regar…

cs.CR · cs.AI · Recent · May 10, 2026

MonitoringBench: Semi-Automated Red-Teaming for Agent Monitoring

Monika Jotautaitė, Maria Angelica Martinez, Ollie Matthews, Tyler Tracy

The paper introduces MonitoringBench, a semi-automated red-teaming methodology that generates diverse and stronger attacks, revealing that current coding-agent monitors often fail against sophisticate…

cs.CR · cs.AI · Recent · May 30, 2026

Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems

Ismail Hossain, Sai Puppala, Zhuoran Lu, Sajedul Talukder +1 more

The paper introduces SkillVetBench, a novel two-stage benchmark that effectively detects and verifies malicious behavior in open agentic skill ecosystems, significantly outperforming existing static a…

cs.CR · cs.AI · Recent · May 30, 2026

Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems

Ismail Hossain, Sai Puppala, Zhuoran Lu, Sajedul Talukder +1 more

The paper introduces SkillVetBench, a novel two-stage benchmark that effectively detects and verifies malicious behavior hidden within open agentic skills, significantly outperforming static and seman…

cs.AI · cs.CR · Recent · May 12, 2026

Do Androids Dream of Breaking the Game? Systematically Auditing AI Agent Benchmarks with BenchJack

Hao Wang, Hanchen Li, Qiuyang Mang, Alvin Cheung +2 more

The paper introduces BenchJack, an automated red-teaming system that systematically audits popular AI agent benchmarks, revealing numerous reward-hacking exploits and demonstrating a method to signifi…

cs.CR · cs.AI · cs.CL · Recent · May 12, 2026

SkillSafetyBench: Evaluating Agent Safety under Skill-Facing Attack Surfaces

Chang Jin, An Wang, Zeming Wei, Kai Wang +6 more

The paper introduces SkillSafetyBench, a comprehensive benchmark demonstrating that agent safety failures often stem from adversarial influences within reusable skills and execution environments, rath…

cs.CL · Recent · Jun 1, 2026

SkillHarm: Lifecycle-Aware Skill-Based Attacks via Automated Construction

Yuting Ning, Zhehao Zhang, Yash Kumar Lal, Boyu Gou +7 more

The paper introduces SkillHarm, a comprehensive benchmark and automated framework for evaluating skill-based attacks across the entire agent skill-use lifecycle, demonstrating that current agents rema…

cs.CR · Recent · May 15, 2026

STRIKE: A Structured Taxonomy of Cybercrime for Risk, Impact, Knowledge, and Evolution

Melissa Pappy, Linh Nguyen, Suman Kumar, Byungkwan Jung +1 more

The paper introduces STRIKE, a multi-dimensional structured taxonomy designed to provide a comprehensive and unified framework for classifying the rapidly evolving complexity of modern cybercrimes.

cs.CR · Recent · Apr 7, 2026

SoK: Understanding Anti-Forensics Concepts and Research Practices Across Forensic Subdomains

Janine Schneider, Florian Ramming, Maximilian Eichhorn, Gaston Pugliese +8 more

This paper systematically analyzes 123 publications on anti-forensics to quantify techniques and attack vectors, identify research patterns, and propose directions for a more coherent and ethical unde…

cs.AI · cs.CR · Recent · May 5, 2026

Redefining AI Red Teaming in the Agentic Era: From Weeks to Hours

Raja Sekhar Rao Dheekonda, Will Pearce, Nick Landers

The paper introduces an AI red teaming agent that drastically reduces the time and effort required for security testing by allowing operators to define complex attack goals using natural language, com…

cs.LG · cs.AI · cs.CL · Recent · May 28, 2026

Generative AI and Digital Ecosystem Resilience: A Proactive Lifecycle-Based Survey

Jonghyun Chung, Rishabh Chaddha, Sanket Badhe, Debanshu Das +2 more

This survey proposes a proactive, lifecycle-based framework, utilizing the C5 Interaction Model, to detect emerging adversarial synthetic narratives generated by GenAI, moving beyond traditional react…

cs.LG · cs.AI · cs.CL · Recent · May 28, 2026

Generative AI and Digital Ecosystem Resilience: A Proactive Lifecycle-Based Survey

Jonghyun Chung, Rishabh Chaddha, Sanket Badhe, Debanshu Das +2 more

This survey proposes a proactive, lifecycle-based framework, utilizing the C5 Interaction Model, to detect emerging adversarial synthetic narratives generated by Generative AI, moving beyond tradition…

cs.CR · cs.LG · Recent · May 25, 2026

Building an Adversarial Malware Dataset by Family and Type: Generation, Evasion, and Poisoning Evaluation

David Košťál, Martin Jureček

The paper constructs a large, adversarial malware dataset from real-world binaries, demonstrating high evasion rates and showing that even small amounts of poisoned data can severely compromise malwar…

cs.CR · Recent · Apr 13, 2026

RedShell: A Generative AI-Based Approach to Ethical Hacking

Ricardo Bessa, Rui Claro, João Trindade, João Lourenço

The paper introduces RedShell, a generative AI tool designed to help ethical hackers generate syntactically and semantically valid malicious PowerShell code, addressing the challenge of data scarcity…

cs.CR · cs.AI · cs.NI · Recent · Apr 5, 2026

NetSecBed: A Container-Native Testbed for Reproducible Cybersecurity Experimentation

Leonardo Bitzki, Diego Kreutz, Tiago Heinrich, Douglas Fideles +3 more

NetSecBed is a container-native, scenario-oriented testbed designed to generate reproducible and auditable network traffic evidence and execution artifacts for complex cybersecurity research.
