~ similar to 2605.16407v1· 20 results
The paper proposes an attestation-aware promotion gate to mitigate supply-chain risks in LLM pipelines by cryptographically verifying and enforcing claims about training and release artifacts before d…
The paper proposes a trust schema and verification framework to ensure that agent skills, which augment LLMs, are rigorously verified before deployment, thereby making human-in-the-loop oversight scal…
The paper introduces MolTrust, a production-deployed trust infrastructure built on W3C standards (VCs and DIDs) that provides a verifiable, multi-layered authorization framework for autonomous AI agen…
LLM-FACETS introduces an open-source, privacy-preserving framework designed to enable non-technical domain experts and compliance officers to audit and evaluate the transparency and accountability of…
The paper introduces CyberCertBench, a new benchmark suite for evaluating LLMs against industry cybersecurity certifications, finding that while frontier models perform well on general knowledge, thei…
Pramana introduces a standardized, protocol-level wire format for autonomous agent outputs, ensuring that every consequential claim is accompanied by a verifiable artifact that can be re-executed by a…
The paper proposes an evidence-driven protocol combining Deterministic Build Systems and Trusted Execution Environments to provide cryptographically verifiable guarantees of software artifact integrit…
The paper proposes a TEE-based architecture that enables external, auditable verification of AI-assisted grant evaluations without exposing the proprietary model, scoring logic, or intermediate reason…
Tingda Shen, Yebo Feng, Konglin Zhu, Xiaojun Jia +2 more
The paper introduces SIGIL, a novel framework that cryptographically seals the entire lifecycle of LLM skills, ensuring verifiable integrity from publication through runtime execution to prevent suppl…
The paper proposes a federated formal verification architecture that treats verification as a polyglot proof system, successfully validating it on complex production subsystems like a Raft consensus m…
The paper introduces a systematic, executable taxonomy of security properties to bridge the gap between theoretical security definitions and their practical implementation in formal verification tools…
Aiman Al Masoud, Antony Anju, Marco Arazzi, Mert Cihangiroglu +5 more
This paper provides the first comprehensive Systematization of Knowledge (SoK) on the security aspects of LLM-as-a-Judge (LaaJ) systems, identifying key vulnerabilities and proposing a taxonomy for fu…
The paper introduces an automated framework demonstrating that LLM system instructions are vulnerable to encoding attacks, where structured output requests can bypass safety refusals and leak sensitiv…
MemLineage introduces a novel, cryptographically-backed defense mechanism that enforces a chain-of-custody for LLM agent memory, preventing untrusted or poisoned state from justifying sensitive action…
The paper introduces an operational post-quantum X.509 assurance framework that rigorously validates ML-KEM and ML-DSA certificates and keys across various deployment stages, achieving comprehensive d…
The paper proposes Agentic Witnessing, a TEE-enabled framework that allows external verifiers to audit the qualitative properties of private datasets by querying an LLM-based auditor without accessing…
The paper proposes a Semantic Gateway and a Zero-Trust security model to formally validate and secure autonomous AI agents operating in enterprise systems, achieving a 100% discovery rate of unauthori…
The paper introduces the Lean-Agent Protocol, a formal verification platform that uses Lean 4 theorem proving to ensure agentic AI actions in finance are mathematically compliant with complex regulati…
The paper introduces an efficient, lightweight LLM framework for smart contract auditing that decouples the audit process into multiple components, achieving high accuracy while significantly reducing…
The paper introduces Neuroforger, a system that combines a new formal specification language with LLMs and type checking to reliably generate and validate concrete violation witnesses (counterexamples…