Papers similar to 2606.14445

~ similar to 2606.14445· 20 results

cs.CRRecentMay 9, 2026

When LLMs Team Up: A Coordinated Attack Framework for Automated Cyber Intrusions

Minfeng Qi, Tianqing Zhu, Zijie Xu, Congcong Zhu +2 more

The paper introduces CAESAR, a novel multi-agent framework that coordinates LLM agents across five specialized roles to improve success rates and stability in complex, multi-stage cyber intrusion task…

View →

cs.OScs.AIcs.CRRecentJun 2, 2026

Agent libOS: A Library-OS-Inspired Runtime for Long-Running, Capability-Controlled LLM Agents

Yingqi Zhang

Agent libOS introduces a library-OS-inspired runtime substrate that treats LLM agents as schedulable processes, providing explicit capability control and robust auditing for long-running, stateful age…

View →

cs.SEcs.AIcs.MARecentMay 31, 2026

LLM Consortium for Software Design Refinement: A Controlled Experiment on Multi-Agent Collaboration Topologies

Nagarjuna Kanamarlapudi, Praveen K

The paper experimentally evaluates 12 multi-agent LLM collaboration topologies for software design, finding that structural adversarial prompting and cross-model review are the most effective approach…

View →

cs.CRcs.SERecentApr 5, 2026

LLM-Enabled Open-Source Systems in the Wild: An Empirical Study of Vulnerabilities in GitHub Security Advisories

Fariha Tanjim Shifat, Hariswar Baburaj, Ce Zhou, Jaydeb Sarker +1 more

The paper analyzes GitHub security advisories for LLM-integrated open-source systems, finding that while most vulnerabilities map to existing code-level weaknesses, the architectural risks like Supply…

View →

cs.CRRecentMay 8, 2026

Demystifying and Detecting Agentic Workflow Injection Vulnerabilities in GitHub Actions

Shenao Wang, Xinyi Hou, Zhao Liu, Yanjie Zhao +4 more

This paper introduces Agentic Workflow Injection (AWI), a new class of vulnerability in LLM-powered GitHub Actions, and presents TaintAWI, a novel taint-analysis tool that identifies hundreds of explo…

View →

cs.CRcs.AIcs.SERecentMay 11, 2026

Comment and Control: Hijacking Agentic Workflows via Context-Grounded Evolution

Neil Fendley, Zhengyu Liu, Aonan Guan, Jiacheng Zhong +1 more

The paper introduces JAW, a novel framework that demonstrates how adversaries can hijack agentic workflows on automation platforms like GitHub Actions by manipulating inputs based on context-grounded…

View →

cs.MAcs.AIRecentMay 28, 2026

Evolve as a Team: Collaborative Self-Evolution for LLM-based Multi-Agent Systems

Zhezheng Hao, Tianfu Wang, Huanshuo Dong, Ziyan Liu +6 more

The paper proposes Meta-Team, an experience-driven framework that enables multi-agent systems (MAS) to collaboratively self-evolve by transforming complex execution experiences into reusable improveme…

View →

cs.CRcs.AIcs.PLRecentMar 21, 2026

AC4A: Access Control for Agents

Reshabh K Sharma, Dan Grossman

The paper introduces AC4A, an access control framework that allows users to precisely limit the capabilities of LLM agents, ensuring they only access the specific APIs or parts of web pages necessary…

View →

cs.CRcs.MARecentApr 15, 2026

SoK: Security of Autonomous LLM Agents in Agentic Commerce

Qian'ang Mao, Jiaxin Wang, Ya Liu, Li Zhu +2 more

The paper develops a unified, cross-layer security framework for autonomous LLM agents operating in agentic commerce, identifying key attack vectors and proposing a layered defense architecture.

View →

cs.CRcs.AIcs.CLRecentMay 14, 2026

Web Agents Should Adopt the Plan-Then-Execute Paradigm

Julien Piet, Annabella Chow, Yiwei Hou, Muxi Lyu +4 more

The paper argues that web agents should abandon the reactive ReAct paradigm in favor of a plan-then-execute approach, which requires developing typed, task-level APIs to properly structure web interac…

View →

cs.SEcs.AIRecentJun 3, 2026

From Prompt to Process: a Process Taxonomy and Comparative Assessment of Frameworks Supporting AI Software Development Agents

Sanderson Oliveira de Macedo

This paper studies AI development frameworks for software engineering and proposes a six-dimension process taxonomy.

View →

cs.AIRecentMay 27, 2026

A Unified Framework for the Evaluation of LLM Agentic Capabilities

Pengyu Zhu, Lijun Li, Yaxing Lyu, Qianxin Luo +7 more

The paper introduces a unified framework to fairly evaluate LLM agentic capabilities by standardizing diverse benchmarks and separating the effects of the LLM model from the surrounding framework and…

View →

cs.SEcs.AIcs.CRRecentJun 2, 2026

Proof-Carrying Agent Actions: Model-Agnostic Runtime Governance for Heterogeneous Agent Systems

Zexun Wang

The paper proposes Proof-Carrying Agent Actions (PCAA), a runtime-neutral governance model that uses action certificates to consistently track and authorize high-risk actions across diverse and hetero…

View →

cs.CRRecentMay 4, 2026

Autonomous LLM Agent Worms: Cross-Platform Propagation, Automated Discovery and Temporal Re-Entry Defense

Mingming Zha, Xiaofeng Wang

The paper introduces a systematic framework and defense mechanisms to analyze and mitigate autonomous LLM agent worms that propagate through persistent agent state and cross-platform multi-agent syste…

View →

cs.CRcs.AIcs.SERecentApr 22, 2026

Taint-Style Vulnerability Detection and Confirmation for Node.js Packages Using LLM Agent Reasoning

Ronghao Ni, Mihai Christodorescu, Limin Jia

The paper introduces LLMVD.js, a multi-stage LLM agent pipeline that effectively detects and confirms taint-style vulnerabilities in Node.js packages, achieving significantly higher confirmation rates…

View →

cs.CRRecentMay 3, 2026

AgenticVM: Agentic AI for Adaptive Software Vulnerability Management

Asrul Arifin, Hussain Ahmad, Yiyao Zhang, Diksha Goel

AgenticVM is a multi-agent framework that uses LLMs and specialized tools to automate and drastically reduce the volume of software vulnerabilities into actionable, prioritized queues.

View →

cs.AIRecentMay 27, 2026

Harness-Bench: Measuring Harness Effects across Models in Realistic Agent Workflows

Yilun Yao, Xinyu Tan, Chao-Hsuan Liu, Yaoming Li +8 more

The paper introduces Harness-Bench, a diagnostic benchmark that measures how different system 'harnesses' affect LLM agent performance in realistic workflows, showing that agent capability must be rep…

View →

cs.AIRecentJun 1, 2026

MOC: Multi-Order Communication in LLM-based Multi-Agent Systems

Yao Guan, Lin Wang, Zhihu Lu, Ziyi Wang +2 more

The paper proposes Multi-Order Communication (MOC) to overcome the limitations of standard first-order message passing in LLM-based multi-agent systems, significantly improving performance by capturin…

View →

cs.AIRecentMay 28, 2026

Indexing the Unreadable: LLM-Native Recursive Construction and Search of Service Taxonomies

Wei Zheng, Yang Yan, Yiyang Shao, Jinyang Li +5 more

The paper proposes A2X, an LLM-native progressive-disclosure scheme that structures service taxonomies hierarchically and searches them layer-by-layer at query time, solving context overflow and impro…

View →

cs.AIRecentMay 29, 2026

Model-Native Computing Architecture: Envisioning Future System Architecture Through the Lens of Computer Architecture

Hai Lin

The paper proposes the Intelligent Computing Architecture Model (ICAM), a six-layer framework that unifies disparate concepts in model-native computing by viewing the LLM stack through a dual-plane ar…

View →