"Real-world deployments" | ArxivCSExplorer

20 results for “Real-world deployments”

CS papers only

Hybrid search: Keyword + semantic, ranked by combined score.ⓘ

Want pure semantic search? Try claim verification →

cs.CRcs.NIRecentMay 5, 2026

Towards a Zero-Trust Supply-Chain Assurance Rubric for ORAN RIC Applications

The paper proposes a zero-trust supply-chain assurance rubric for O-RAN RIC applications to secure the entire lifecycle, from development to runtime.

View →

cs.LGcs.AIRecentMay 29, 2026

What changes after deployment? A survey on On-device Learning in TinyML

Massimo Pavan, Luca Pezzarossa, Fabrizio Pittorino, Manuel Roveri +1 more

This survey analyzes the field of On-device Learning (ODL) for TinyML by categorizing existing works based on how they address various types of post-deployment distribution changes.

View →

cs.CRcs.SERecentApr 1, 2026

Automated Generation of Cybersecurity Exercise Scenarios

Charilaos Skandylas, Mikael Asplund

The paper presents an approach to automatically generate a large number of diverse and complex cybersecurity scenarios that model enterprise IT systems for training purposes.

View →

cs.AIcs.CYRecentMay 27, 2026

Operational AI Deployment Assurance: Governance-State Orchestration Under Threshold-Sensitive Deployment Conditions -- A Governance Framework for High-Stakes AI Systems

Khalid Adnan Alsayed

The paper proposes Operational AI Deployment Assurance (OADA), a governance framework that translates complex AI evaluation metrics and operational uncertainties into actionable, deployment-oriented a…

View →

cs.CRRecentApr 20, 2026

TitanCA: Lessons from Orchestrating LLM Agents to Discover 100+ CVEs

Ting Zhang, Yikun Li, Chengran Yang, Ratnadira Widyasari +14 more

TitanCA presents a novel, multi-agent LLM orchestration framework that significantly improves vulnerability discovery by reducing false positives and identifying numerous zero-day vulnerabilities.

View →

cs.AIRecentMay 27, 2026

Benchmarking AI for low-resource contexts: Thinking beyond leaderboards

Aakash Pant, Kavya Shah, Apoorv Agnihotri, Sneha Nikam +2 more

The paper critiques current AI benchmarking practices for low-resource settings, arguing that evaluation must shift focus from isolated model performance to the holistic performance of the deployed sy…

View →

cs.CYcs.CRRecentMay 20, 2026

Backchaining Loss of Control Mitigations from Mission-Specific Benchmarks in National Security

Matteo Pistillo, Samantha Faraone, Joshua Herman

The paper proposes a novel, empirical methodology called 'backchaining' to derive and prioritize Loss of Control (LoC) mitigations by analyzing the errors an AI system makes on mission-specific nation…

View →

cs.CRRecentMay 12, 2026

ACTING: A Platform for Cyber Ranges Federation

Kyriakos Christou, Maria Michalopoulou, Stefano Taggi, Matteo Merialdo +20 more

The ACTING platform addresses the need for interoperable cyber-range training by providing a structured language (EDL-FG) for scenario description and automated evaluation mechanisms for complex, mult…

View →

cs.CRRecentMar 23, 2026

Semi-Automated Threat Modeling of Cloud-Based Systems Through Extracting Software Architecture from Configuration and Network Flow

Nicholas Pecka, Lotfi Ben Othmane, Bharat Bhargava, Renee Bryce

The paper proposes a novel semi-automated method to perform continuous threat modeling by inferring the actual system architecture from combined static configuration and dynamic network flow data, sig…

View →

cs.CRcs.AIcs.DCRecentMar 31, 2026

Downsides of Smartness Across Edge-Cloud Continuum in Modern Industry

Akhil Gupta Chigullapally, Sharvan Vittala, Razin Farhan Hussian, Mohsen Amini Salehi

This paper analyzes the potential downsides of integrating advanced AI and smart capabilities across the Edge-Cloud continuum in modern industry, focusing specifically on security vulnerabilities, sid…

View →

cs.CRRecentApr 15, 2026

RealVuln: Benchmarking Rule-Based, General-Purpose LLM, and Security-Specialized Scanners on Real-World Code

John Pellew, Faizan Raza

The paper introduces RealVuln, a benchmark that demonstrates a clear three-tier performance hierarchy for security scanners on real-world code, with specialized tools significantly outperforming gener…

View →

cs.SEcs.CRRecentApr 22, 2026

A Ground-Truth-Based Evaluation of Vulnerability Detection Across Multiple Ecosystems

Peter Mandl, Paul Mandl, Martin Häusl, Maximilian Auch

The paper conducts an empirical evaluation of automated vulnerability detection tools across multiple software ecosystems using a curated ground-truth dataset derived from OSV, highlighting systematic…

View →

cs.CRRecentMay 19, 2026

Hunting Vulnerability Variants in AI Infra: Measurement and Reference-Driven Detection

Tian Dong, Yanjun Chen, Shoufeng Zhang, Huaien Zhang +5 more

This paper measures the prevalence of recurring vulnerability patterns (variants) across multiple AI infrastructure repositories and proposes INFRASCOPE, a framework to automatically detect these vari…

View →

cs.AIcs.CRRecentMay 11, 2026

MATRA: Modeling the Attack Surface of Agentic AI Systems -- OpenClaw Case Study

Tim Van hamme, Thomas Vissers, Javier Carnerero-Cano, Mario Fritz +3 more

The paper introduces MATRA, a systematic threat modeling framework, to assess how known LLM threats translate into concrete, deployment-specific risks within autonomous agentic AI systems.

View →

cs.SEcs.CRRecentMar 18, 2026

Revisiting Vulnerability Patch Identification on Data in the Wild

Ivana Clairine Irsan, Ratnadira Widyasari, Ting Zhang, Huihui Huang +6 more

The paper demonstrates that security patch detection models trained solely on publicly reported vulnerabilities (NVD) perform poorly when tested on real-world, unreported 'in-the-wild' patches, sugges…

View →

cs.CRcs.AIcs.LGRecentMar 17, 2026

DeepStage: Learning Autonomous Defense Policies Against Multi-Stage APT Campaigns

Trung V. Phan, Tri Gia Nguyen, Thomas Bauschert

DeepStage is a deep reinforcement learning framework that achieves autonomous, stage-aware defense against multi-stage APT campaigns by fusing graph-based telemetry and predicting attacker stages.

View →

cs.LGcs.AIRecentJun 1, 2026

Adaptive Auto-Harness: Sustained Self-Improvement for Agentic System Deployment on Open-Ended Task Streams

Zewen Liu, Zhan Shi, Yisi Sang, Bing He +6 more

Adaptive Auto-Harness introduces a framework that enables LLM agents to sustain self-improvement and maintain high performance over open-ended, shifting task streams, outperforming existing fixed-benc…

View →

cs.CRcs.AIRecentMar 19, 2026

ClawTrap: A MITM-Based Red-Teaming Framework for Real-World OpenClaw Security Evaluation

Haochen Zhao, Shaoyang Cui

The paper introduces ClawTrap, a MITM-based red-teaming framework, to evaluate the security robustness of web agents like OpenClaw against dynamic, real-world network attacks, finding that model stren…

View →

cs.SEcs.AIRecentMay 27, 2026

DeltaMCP: Incremental Regeneration via Spec-Aware Transformation for MCP servers

Aditya Pujara, Xiaogang Zhu, Hsiang-Ting Chen

DeltaMCP is a specification-aware, incremental regeneration tool that efficiently updates Model Context Protocol (MCP) servers by only modifying affected tooling when a service's OpenAPI specification…

View →

cs.CRcs.AIRecentApr 15, 2026

SafeHarness: Lifecycle-Integrated Security Architecture for LLM-based Agent Deployment

Xixun Lin, Yang Liu, Yancheng Chen, Yongxuan Wu +7 more

The paper introduces SafeHarness, a novel, lifecycle-integrated security architecture that significantly reduces unsafe behavior and attack success rates in LLM agents by weaving multiple defense laye…

View →