~ similar to 2604.02617v1· 20 results
Pramana introduces a standardized, protocol-level wire format for autonomous agent outputs, ensuring that every consequential claim is accompanied by a verifiable artifact that can be re-executed by a…
The paper proposes an attestation-aware promotion gate to mitigate supply-chain risks in LLM pipelines by cryptographically verifying and enforcing claims about training and release artifacts before d…
The paper introduces Neuroforger, a system that combines a new formal specification language with LLMs and type checking to reliably generate and validate concrete violation witnesses (counterexamples…
Jinhu Qi, Muzhi Li, Jiahong Liu, Yuqin Shu +8 more
This survey provides a comprehensive, practical guide to ensuring the trustworthiness of complex, autonomous agentic AI systems by focusing on safety, robustness, privacy, and system security.
Yutong Cheng, Changze Li, Raihan Sultan Pasha Basuki, Qian Cui +2 more
TTPrint proposes a novel diverge-then-converge framework for extracting MITRE ATT&CK techniques from CTI reports, significantly improving both recall and precision compared to existing methods.
The paper proposes Agentic Witnessing, a TEE-enabled framework that allows external verifiers to audit the qualitative properties of private datasets by querying an LLM-based auditor without accessing…
LLM-FACETS introduces an open-source, privacy-preserving framework designed to enable non-technical domain experts and compliance officers to audit and evaluate the transparency and accountability of…
FineVerify introduces a fine-grained self-verification framework that improves agentic search by decomposing complex questions into verifiable sub-questions, leading to significant accuracy gains over…
Yuxi Sun, Wenbo Shang, Wei Gao, Xin Huang +1 more
The paper introduces a diagnostic testbed, PAVE, to evaluate how LLMs arbitrate between their internal knowledge and retrieved evidence during fact-checking, revealing that this arbitration is unrelia…
The paper proposes a trust-boundary architecture using Lean 4 to verify the deterministic structured computations surrounding LLM pipelines, providing verifiable certificates for high-stakes deploymen…
The paper introduces an agentic workflow that uses large language models (LLMs) combined with structured querying and constrained tools to automate and significantly improve the accuracy of initial se…
This paper addresses the critical need for trustworthy LLMs in science by proposing a comprehensive, multi-layered defense framework and methodology to evaluate unique scientific vulnerabilities.
This paper benchmarks LLMs for smart contract security analysis, concluding that while LLMs show potential, their reliability is limited by lexical bias and requires integration with traditional stati…
The paper introduces an execution-grounded, cross-language framework that significantly improves the reliability of LLM-driven code vulnerability analysis by ensuring that all proposed fixes are confi…
The paper introduces MolTrust, a production-deployed trust infrastructure built on W3C standards (VCs and DIDs) that provides a verifiable, multi-layered authorization framework for autonomous AI agen…
The paper proposes a trust schema and verification framework to ensure that agent skills, which augment LLMs, are rigorously verified before deployment, thereby making human-in-the-loop oversight scal…
The paper introduces Auto-ART, a comprehensive open-source framework that provides structured meta-analysis and automated testing for adversarial robustness, revealing significant gaps in current ML s…
ResearchLoop introduces an evidence-gated control plane to manage and audit the state of AI-assisted computational research, mitigating the risk of unverified claims.
The paper defines AI Identity as the correspondence between an agent's declared state and its observed behavior, concluding that current infrastructure and standards are fundamentally inadequate for g…
The paper introduces SciIntBench, an adversarial benchmark that reveals that LLMs' adherence to research integrity norms is highly sensitive to how the misconduct is framed, often failing when the mis…