~ similar to 2605.06508v1· 20 results
This paper replicates and extends a study on Java security API misuse in LLMs, finding that while newer models improve performance, the misuse risk persists and is significantly mitigated by external…
The paper introduces NICE, a declarative framework that uses NixOS to build and automatically validate reproducible environments for demonstrating software vulnerabilities (CVEs), thereby improving th…
Zirui Chen, Qi Zhan, Jiayuan Zhou, Xing Hu +2 more
This paper conducts a large-scale empirical study demonstrating that Java library exploits can accurately identify affected versions, achieving high recall and precision, and proposes strategies for e…
The paper proposes using Trusted-Execution Environments (TEEs) to create a scalable, privacy-preserving system where authors can submit cryptographic proofs of correct research replication, thereby ad…
Zehra Karadağ, Simon Klix, René Walendy, Felix Hahn +4 more
This paper systematizes two decades of hardware reverse engineering research by analyzing 187 publications, identifying key technical methods and recommending improvements for reproducibility, standar…
Fariha Tanjim Shifat, Hariswar Baburaj, Ce Zhou, Jaydeb Sarker +1 more
The paper analyzes GitHub security advisories for LLM-integrated open-source systems, finding that while most vulnerabilities map to existing code-level weaknesses, the architectural risks like Supply…
This paper conducts a large-scale, repository-aware security analysis of AI agent skills, demonstrating that incorporating surrounding project context drastically reduces the rate of false positive ma…
Tian Dong, Yanjun Chen, Shoufeng Zhang, Huaien Zhang +5 more
This paper measures the prevalence of recurring vulnerability patterns (variants) across multiple AI infrastructure repositories and proposes INFRASCOPE, a framework to automatically detect these vari…
Jan Pennekamp, Johannes Lohmöller, David Schütte, Joscha Loos +1 more
This paper systematically analyzes 2.7 million arXiv submissions to demonstrate that nearly every preprint unintentionally discloses sensitive or unnecessary information through its source files, prop…
The paper argues that current Software Bills of Materials (SBOMs) are fundamentally flawed due to a lack of shared understanding regarding what constitutes a 'component,' demonstrating that existing t…
Zheng-Xin Yong, Parv Mahajan, Andy Wang, Ida Caspary +11 more
The paper conducts a preliminary safety evaluation of the open-weight LLM Kimi K2.5, finding that while it is highly capable, it exhibits concerning dual-use risks, particularly regarding CBRNE misuse…
The paper introduces a novel multi-LLM orchestration system combined with symbolic execution to successfully detect memory vulnerabilities in uncompilable, incomplete Rust CVE code snippets, achieving…
The paper proposes an attestation-aware promotion gate to mitigate supply-chain risks in LLM pipelines by cryptographically verifying and enforcing claims about training and release artifacts before d…
The paper proposes an evidence-driven protocol combining Deterministic Build Systems and Trusted Execution Environments to provide cryptographically verifiable guarantees of software artifact integrit…
This study conducts a large-scale longitudinal analysis of CodeQL, finding that while the tool is effective at detecting vulnerabilities, its detection capabilities are not guaranteed to be stable acr…
The paper conducts an empirical evaluation of automated vulnerability detection tools across multiple software ecosystems using a curated ground-truth dataset derived from OSV, highlighting systematic…
Sven Peldszus, Frederik Reiche, Kevin Hermann, Sophie Corallo +2 more
The paper maps 66 security design DSLs to 559 code-level analyzer checks to quantify the challenging relationship between high-level security design and low-level implementation vulnerabilities, revea…
Leon Schuermann, Brad Campbell, Branden Ghena, Philip Levis +2 more
This paper analyzes the impact of Tock's secure technical design, built using Rust and hardware protection, on its successful transition from academic research to a widely adopted, production-grade op…
The paper proposes GCVE, a decentralized, open, and extensible socio-technical model to standardize and enrich the entire lifecycle of vulnerability information, moving beyond simple identifier alloca…
This paper proposes an empirical methodology to automate web application trustworthiness assessment by leveraging Large Language Models (LLMs) to verify adherence to secure coding practices, showing t…