Papers similar to 2605.29146

~ similar to 2605.29146· 19 results

cs.AIRecentMay 27, 2026

SafeMed-R1: Clinician-Audited Safety and Ethics Alignment for Medical Large Language Models

Chao Ding, Mouxiao Bian, Tianbin Li, Minjia Yuan +11 more

The paper introduces SafeMed-R1, a clinically audited LLM that significantly improves safety and ethical alignment for medical applications, matching or exceeding resident performance on safety-critic…

View →

cs.CLRecentMay 31, 2026

DrugClaw and DrugAudit: A Primary-Source-Grounded Agent and Authority-Aware Benchmark for Drug-Information Question Answering

Qing Wang, Bo Li, Jialu Liang, Daling Shi +2 more

The paper introduces DrugClaw, a multi-agent system, and DrugAudit, a new benchmark, demonstrating that DrugClaw excels at answering drug-related questions by grounding answers in primary regulatory s…

View →

cs.AIRecentMay 30, 2026

TRACE: Trajectory Risk-Aware Compression for Long-Horizon Agent Safety

Zhepei Hong, Lin Wang, Liting Li, Haokai Ma +4 more

The paper proposes TRACE, a trajectory risk-aware compression method, to effectively aggregate sparse and delayed safety evidence across long agent trajectories, achieving state-of-the-art performance…

View →

cs.CRcs.AIRecentMar 28, 2026

SafetyDrift: Predicting When AI Agents Cross the Line Before They Actually Do

Aditya Dhodapkar, Farhaan Pishori

The paper introduces SafetyDrift, a predictive model that forecasts when AI agents will violate safety protocols by analyzing the cumulative risk across sequences of individually safe actions.

View →

cs.AIcs.LGRecentMay 30, 2026

Medication-Aware Financial Exploitation Detection for Alzheimer's Patients Using Edge-Aware Interaction Risk Modeling

Farzana Akter, Lisan Al Amin, Rakib Hossain, Chaitanya Gunupudi +1 more

The paper proposes a medication-aware framework that integrates medication adherence with financial transaction monitoring to significantly improve the detection of financial exploitation in Alzheimer…

View →

cs.LGcs.AIcs.CRRecentJun 2, 2026

RUBAS: Rubric-Based Reinforcement Learning for Agent Safety

Xian Qi Loye, Qinglin Su, Zhexin Zhang, Shiyao Cui +4 more

The paper introduces RUBAS, a rubric-based reinforcement learning framework that improves agent safety by providing fine-grained, multi-dimensional rewards for complex tool-use scenarios.

View →

cs.SEcs.CRRecentMar 18, 2026

Who Tests the Testers? Systematic Enumeration and Coverage Audit of LLM Agent Tool Call Safety

Xuan Chen, Lu Yan, Ruqi Zhang, Xiangyu Zhang

The paper introduces SafeAudit, a meta-audit framework that systematically enumerates test cases and uses a quantitative metric to uncover significant residual unsafe behaviors in LLM agents that exis…

View →

cs.CLRecentMay 31, 2026

UniD$^3$: A Knowledge Graph-Enhanced RAG Framework for Drug-Disease Discovery and Reasoning

Qing Wang, Tianshi Liu, Minghao Zhou, Jialu Liang +4 more

UniD$^3$ is a novel Knowledge Graph-enhanced RAG framework that processes vast biomedical literature to systematically extract, organize, and validate comprehensive drug-disease knowledge, achieving h…

View →

cs.CRRecentMar 18, 2026

The Verifier Tax: Horizon Dependent Safety Success Tradeoffs in Tool Using LLM Agents

Tanmay Sah, Vishal Srivastava, Dolly Sah, Kayden Jordan

The paper analyzes how runtime safety enforcement impacts the performance of multi-step LLM agents, finding that while safety mechanisms can block unsafe actions, they impose a significant performance…

View →

cs.AIRecentMay 31, 2026

CAREAgent: Clinical Agent with Structured Reasoning and Tool-Integrated for Order Generation

Ruihui Hou, Ziyue Huai, Chennuo Zhang, Ziyan Liu +4 more

CAREAgent is a novel agent designed for fine-grained clinical order generation, achieving significant performance improvements on unseen benchmarks by integrating structured reasoning and tool usage.

View →

cs.CLcs.AIRecentMay 28, 2026

SURGENT: A Surgical Multi-Agent Assistance System Across the Perioperative Workflow

Dongsheng Shi, Yue Li, Xin Yi, Yongyi Cui +2 more

The paper introduces SURGENT, a multi-agent assistance system designed for the entire perioperative workflow, which outperforms standard LLMs by providing context-aware, traceable, and privacy-preserv…

View →

cs.CRRecentMay 12, 2026

Safety Context Injection: Inference-Time Safety Alignment via Static Filtering and Agentic Analysis

Zhenhao Xu, Wenhan Chang, Yichuan Chen, Yuxin Fang +2 more

The paper proposes Safety Context Injection (SCI), an inference-time framework that prepends a structured external risk report to protect Large Reasoning Models (LRMs) against sophisticated jailbreaks…

View →

cs.AIRecentMay 28, 2026

EHRBench: An Automated and Reliable EHR-based Benchmark for Clinical Decision Making with LLMs

Yuzhang Xie, Keqi Han, Yunpeng Xiao, Hejie Cui +6 more

The paper introduces EHRBench, a large-scale, automated, and reliable benchmark derived from real Electronic Health Records (EHRs) to rigorously evaluate the clinical decision-making capabilities of L…

View →

cs.CLRecentMay 31, 2026

Benchmarking Local LLMs for Natural-Language-to-SQL Querying in Biopharmaceutical Manufacturing: An Empirical Benchmark on Consumer-Grade Hardware

Sagar Bhetwal, Rajan Bastakoti, Nirajan Acharya, Gaurav Kumar Gupta

This study benchmarks four local LLMs for natural-language-to-SQL querying in biopharma manufacturing, finding that general-purpose code-tuned models like Llama 3.1 8B and Qwen 2.5 Coder 7B outperform…

View →

cs.CLcs.AIcs.CRRecentMay 28, 2026

Relevance as a Vulnerability: How Web Retrieval Degrades Safety Alignment in LLM Agents

Aditya Nawal, Manit Baser, Mohan Gurusamy

This paper introduces AgentREVEAL, a diagnostic framework showing that the utility of web retrieval in LLM agents creates a safety-utility trade-off, as relevance itself can degrade safety alignment a…

View →

cs.CLcs.AIcs.CRRecentMay 28, 2026

Relevance as a Vulnerability: How Web Retrieval Degrades Safety Alignment in LLM Agents

Aditya Nawal, Manit Baser, Mohan Gurusamy

This paper introduces AgentREVEAL, a diagnostic framework that demonstrates that the utility of web retrieval in LLM agents creates a safety-utility trade-off, as relevance itself can degrade safety a…

View →

cs.AIcs.CLcs.CRRecentMay 28, 2026

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

Dongrui Liu, Yu Li, Zhonghao Yang, Peng Wang +46 more

The paper introduces AgentDoG 1.5, a lightweight and scalable alignment framework that significantly improves AI agent safety and security for complex open-world agent deployments.

View →

cs.AIcs.CLcs.CRRecentMay 28, 2026

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

Dongrui Liu, Yu Li, Zhonghao Yang, Peng Wang +46 more

The paper introduces AgentDoG 1.5, a lightweight and scalable alignment framework that significantly improves AI agent safety and security for complex, open-world agentic scenarios.

View →

cs.CRcs.AIcs.LGRecentMay 7, 2026

Research on Security Enhancement Methods for Adversarial Robust Large Language Model Intelligent Agents for Medical Decision-Making Tasks

Saisai Hu

The paper proposes ARSM-Agent, a full-link security enhancement framework, to significantly improve the adversarial robustness and security of large language model agents used for critical medical dec…

View →