20 results for “Real-world deployments”
CS papers onlyHybrid search: Keyword + semantic, ranked by combined score.ⓘ
Want pure semantic search? Try claim verification →
The paper proposes a zero-trust supply-chain assurance rubric for O-RAN RIC applications to secure the entire lifecycle, from development to runtime.
This survey analyzes the field of On-device Learning (ODL) for TinyML by categorizing existing works based on how they address various types of post-deployment distribution changes.
The paper presents an approach to automatically generate a large number of diverse and complex cybersecurity scenarios that model enterprise IT systems for training purposes.
The paper proposes Operational AI Deployment Assurance (OADA), a governance framework that translates complex AI evaluation metrics and operational uncertainties into actionable, deployment-oriented a…
Ting Zhang, Yikun Li, Chengran Yang, Ratnadira Widyasari +14 more
TitanCA presents a novel, multi-agent LLM orchestration framework that significantly improves vulnerability discovery by reducing false positives and identifying numerous zero-day vulnerabilities.
Aakash Pant, Kavya Shah, Apoorv Agnihotri, Sneha Nikam +2 more
The paper critiques current AI benchmarking practices for low-resource settings, arguing that evaluation must shift focus from isolated model performance to the holistic performance of the deployed sy…
The paper proposes a novel, empirical methodology called 'backchaining' to derive and prioritize Loss of Control (LoC) mitigations by analyzing the errors an AI system makes on mission-specific nation…
The ACTING platform addresses the need for interoperable cyber-range training by providing a structured language (EDL-FG) for scenario description and automated evaluation mechanisms for complex, mult…
The paper proposes a novel semi-automated method to perform continuous threat modeling by inferring the actual system architecture from combined static configuration and dynamic network flow data, sig…
This paper analyzes the potential downsides of integrating advanced AI and smart capabilities across the Edge-Cloud continuum in modern industry, focusing specifically on security vulnerabilities, sid…
The paper introduces RealVuln, a benchmark that demonstrates a clear three-tier performance hierarchy for security scanners on real-world code, with specialized tools significantly outperforming gener…
The paper conducts an empirical evaluation of automated vulnerability detection tools across multiple software ecosystems using a curated ground-truth dataset derived from OSV, highlighting systematic…
Tian Dong, Yanjun Chen, Shoufeng Zhang, Huaien Zhang +5 more
This paper measures the prevalence of recurring vulnerability patterns (variants) across multiple AI infrastructure repositories and proposes INFRASCOPE, a framework to automatically detect these vari…
The paper introduces MATRA, a systematic threat modeling framework, to assess how known LLM threats translate into concrete, deployment-specific risks within autonomous agentic AI systems.
The paper demonstrates that security patch detection models trained solely on publicly reported vulnerabilities (NVD) perform poorly when tested on real-world, unreported 'in-the-wild' patches, sugges…
DeepStage is a deep reinforcement learning framework that achieves autonomous, stage-aware defense against multi-stage APT campaigns by fusing graph-based telemetry and predicting attacker stages.
Adaptive Auto-Harness introduces a framework that enables LLM agents to sustain self-improvement and maintain high performance over open-ended, shifting task streams, outperforming existing fixed-benc…
The paper introduces ClawTrap, a MITM-based red-teaming framework, to evaluate the security robustness of web agents like OpenClaw against dynamic, real-world network attacks, finding that model stren…
DeltaMCP is a specification-aware, incremental regeneration tool that efficiently updates Model Context Protocol (MCP) servers by only modifying affected tooling when a service's OpenAPI specification…
Xixun Lin, Yang Liu, Yancheng Chen, Yongxuan Wu +7 more
The paper introduces SafeHarness, a novel, lifecycle-integrated security architecture that significantly reduces unsafe behavior and attack success rates in LLM agents by weaving multiple defense laye…