Papers similar to 2605.08257v1

20 results

cs.AI · Recent · May 27, 2026

SafeMed-R1: Clinician-Audited Safety and Ethics Alignment for Medical Large Language Models

Chao Ding, Mouxiao Bian, Tianbin Li, Minjia Yuan +11 more

The paper introduces SafeMed-R1, a clinically audited LLM that significantly improves safety and ethical alignment for medical applications, matching or exceeding resident performance on safety-critic…

cs.CR · Recent · May 28, 2026

SAMD: A Tool for Identifying False Data Injection Scenarios in AI/ML-enabled Medical Devices

Mohammadreza Hallajiyan, Xueren Ge, Athish Pranav Dharmalingam, Gargi Mitra +3 more

The paper introduces SAMD, an automated tool that uses STPA-Sec to identify potential false data injection attack scenarios in AI/ML-enabled medical devices during the design phase.

cs.CL, cs.AI · Recent · May 28, 2026

SURGENT: A Surgical Multi-Agent Assistance System Across the Perioperative Workflow

Dongsheng Shi, Yue Li, Xin Yi, Yongyi Cui +2 more

The paper introduces SURGENT, a multi-agent assistance system designed for the entire perioperative workflow, which outperforms standard LLMs by providing context-aware, traceable, and privacy-preserv…

cs.AI, cs.CR · Recent · May 15, 2026

GRID: Graph Representation of Intelligence Data for Security Text Knowledge Graph Construction

Liangyi Huang, Zichen Liu, Fei Shao, Shang Ma +4 more

The paper introduces GRID, an end-to-end framework that significantly improves the construction of security knowledge graphs from cyber threat intelligence by replacing unstable LLM-based supervision…

cs.AI · Recent · Jun 1, 2026

AutoMedBench: Towards Medical AutoResearch with Agentic AI Models

Junqi Liu, Salena Song, Yuhan Wang, Jiawei Mao +11 more

The paper introduces AutoMedBench, a novel workflow-aware benchmark that evaluates autonomous medical-AI agents across a five-stage research process, revealing that agents struggle most with validatio…

cs.CR, cs.AI, cs.CL · Recent · May 1, 2026

When RAG Chatbots Expose Their Backend: An Anonymized Case Study of Privacy and Security Risks in Patient-Facing Medical AI

Alfredo Madrid-García, Miguel Rujas

This paper demonstrates that patient-facing RAG chatbots frequently expose sensitive system configurations, knowledge base details, and conversation history through client-server communication, posing…

cs.CR, cs.AI · Recent · Mar 17, 2026

Security Assessment and Mitigation Strategies for Large Language Models: A Comprehensive Defensive Framework

Taiwo Onitiju, Iman Vakilinia

The paper establishes a standardized security assessment framework and develops a multi-layered defensive system, demonstrating that systematic testing and external defenses are crucial for safe LLM d…

cs.AI, cs.LG · Recent · May 30, 2026

Medication-Aware Financial Exploitation Detection for Alzheimer's Patients Using Edge-Aware Interaction Risk Modeling

Farzana Akter, Lisan Al Amin, Rakib Hossain, Chaitanya Gunupudi +1 more

The paper proposes a medication-aware framework that integrates medication adherence with financial transaction monitoring to significantly improve the detection of financial exploitation in Alzheimer…

cs.CR, cs.AI · Recent · May 11, 2026

Knowledge Poisoning Attacks on Medical Multi-Modal Retrieval-Augmented Generation

Peiru Yang, Haoran Zheng, Tong Ju, Shiting Wang +5 more

The paper proposes M³Att, a knowledge-poisoning framework that injects covert misinformation into medical multimodal RAG systems using paired visual data triggers, demonstrating attac…

cs.CR · Recent · Mar 24, 2026

Security Barriers to Trustworthy AI-Driven Cyber Threat Intelligence in Finance: Evidence from Practitioners

Emir Karaosman, Advije Rizvani, Irdin Pekaric

This paper investigates the practical barriers preventing the trustworthy deployment of AI-driven Cyber Threat Intelligence (CTI) in the highly regulated financial sector, identifying four key socio-t…

cs.CR · Recent · May 7, 2026

Autonomous Adversary: Red-Teaming in the age of LLM

Mohammad Mamun, Mohamed Gaber, Scott Buffett, Sherif Saad

The paper evaluates Language Model Agents (LMAs) for red-teaming by benchmarking their ability to perform lateral movement, finding that expert-defined action plans are most effective, though all moda…

cs.CR, cs.AI · Recent · May 7, 2026

SafeHarbor: Hierarchical Memory-Augmented Guardrail for LLM Agent Safety

Zhe Liu, Zonghao Ying, Wenxin Zhang, Quanchen Zou +4 more

SafeHarbor is a novel, hierarchical memory-augmented framework that establishes context-aware decision boundaries for LLM agents, achieving state-of-the-art safety while minimizing over-refusal.

cs.CR, cs.AI, cs.DB · Recent · May 31, 2026

Inference Cost Attacks for Retrieval-Augmented Large Language Models

Chengliang Liu, Liangbo Ning, Yujuan Ding, Wenqi Fan

This paper introduces a novel attack, RA-ICA, that targets RAG-enhanced LLMs by poisoning external knowledge bases to drastically increase inference costs, achieving up to a 13.12x increase in token c…

cs.AI, cs.CL, cs.CR · Recent · May 17, 2026

Towards trustworthy agentic AI: a comprehensive survey of safety, robustness, privacy, and system security

Jinhu Qi, Muzhi Li, Jiahong Liu, Yuqin Shu +8 more

This survey provides a comprehensive, practical guide to ensuring the trustworthiness of complex, autonomous agentic AI systems by focusing on safety, robustness, privacy, and system security.

cs.CR, cs.AI · Recent · Apr 28, 2026

Structured Security Auditing and Robustness Enhancement for Untrusted Agent Skills

Lijia Lv, Xuehai Tang, Jie Wen, Jizhong Han +1 more

The paper introduces SkillGuard-Robust, a novel framework for robust, cross-file security auditing of untrusted agent skills, achieving high accuracy on large-scale package evaluations.

cs.CR, cs.AI · Recent · Mar 18, 2026

Caging the Agents: A Zero Trust Security Architecture for Autonomous AI in Healthcare

Saikat Maiti

The paper proposes and validates a comprehensive four-layer Zero Trust security architecture designed to mitigate critical vulnerabilities in autonomous AI agents handling Protected Health Information…

cs.AI, cs.CL, cs.LG · Recent · May 28, 2026

Why Specialist Models Still Matter: A Heterogeneous Multi-Agent Paradigm for Medical Artificial Intelligence

Yanan Wang, Shuaicong Hu, Jian Liu, Guohui Zhou +2 more

The paper proposes HetMedAgent, a multi-agent framework, demonstrating that combining generalist LLMs with domain-specific specialist models significantly improves medical AI performance by enabling s…

cs.CR, cs.CV · Recent · Mar 18, 2026

Toward Reliable, Safe, and Secure LLMs for Scientific Applications

Saket Sanjeev Chaturvedi, Joshua Bergerson, Tanwi Mallick

This paper addresses the critical need for trustworthy LLMs in science by proposing a comprehensive, multi-layered defense framework and methodology to evaluate unique scientific vulnerabilities.

cs.CR, cs.LG, cs.MA · Recent · Apr 6, 2026

Explainable Autonomous Cyber Defense using Adversarial Multi-Agent Reinforcement Learning

Yiyao Zhang, Diksha Goel, Hussain Ahmad

The paper introduces C-MADF, a causally constrained multi-agent framework that significantly reduces false positives in autonomous cyber defense by restricting response actions to structurally consist…

cs.CR, cs.LG, cs.MA · Recent · May 1, 2026

When Embedding-Based Defenses Fail: Rethinking Safety in LLM-Based Multi-Agent Systems

Lingxi Zhang, Guangtao Zheng, Hanjie Chen

This paper analyzes the failure of current embedding-based defenses in multi-agent LLM systems and proposes using token-level confidence scores (logits) for improved robustness.
