This survey provides a comprehensive, practical guide to ensuring the trustworthiness of complex, autonomous agentic AI systems by focusing on safety, robustness, privacy, and system security.
Agentic AI systems -- Large Language Models (LLMs) augmented with planning, tool use, memory, and long-horizon interactions -- can execute complex tasks autonomously, but their multi-step trajectories introduce new failure modes that challenge trustworthiness. This survey provides a focused examination of trustworthy agentic AI through two core dimensions that are critical for high-risk deployments: Safety and Robustness, and Privacy and System Security. For each dimension, we clarify key concepts, identify where risks emerge along the agent workflow, and summarize stage-targeted mitigation strategies. Other trustworthiness aspects (value alignment, transparency, fairness, and accountability) are discussed as relevant context rather than parallel chapters. To support consistent comparison and deployment decisions, we consolidate evaluation into a unified metrics-and-benchmarks hub, emphasizing both outcome and process signals (e.g., constraint violations, trace completeness, and adversarial success rates) and offering scenario-to-metric guidance for release gating. We conclude by outlining open challenges, including self-evolving agents, runtime monitoring and verification, privacy-preserving personalization, and the trust-utility trade-off, and by presenting a case study of real-world security failures in open-source agentic systems. Our goal is to serve as a practical reference for researchers and practitioners building trustworthy agentic systems in high-stakes environments.
SoK: The Attack Surface of Agentic AI -- Tools, and Autonomy
This paper systematically maps the expanded attack surface of agentic AI systems…
Clawed and Dangerous: Can We Trust Open Agentic Systems?
This paper systematizes the security challenges of open agentic systems, conclud…
Security Barriers to Trustworthy AI-Driven Cyber Threat Intelligence in Finance: Evidence from Pract…
This paper investigates the practical barriers preventing the trustworthy deploy…
Security, privacy, and agentic AI in a regulatory view: From definitions and distinctions to provisi…
This paper reviews recent EU AI regulatory documents to clarify definitions and…
Towards Secure Retrieval-Augmented Generation: A Comprehensive Review of Threats, Defenses and Bench…
This paper provides the first comprehensive, end-to-end survey dedicated to the…
AI-Assisted Hardware Security Verification: A Survey and AI Accelerator Case Study
This survey reviews the integration of AI and LLMs into hardware security verifi…
Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses
This survey provides a comprehensive, structured review of safety research in Em…
Verify Before You Fix: Agentic Execution Grounding for Trustworthy Cross-Language Code Analysis
The paper introduces an execution-grounded, cross-language framework that signif…