~ similar to 2605.30152· 20 results
The paper introduces a systematic framework and defense mechanisms to analyze and mitigate autonomous LLM agent worms that propagate through persistent agent state and cross-platform multi-agent syste…
Agent libOS introduces a library-OS-inspired runtime substrate that treats LLM agents as schedulable processes, providing explicit capability control and robust auditing for long-running, stateful age…
The paper evaluates Language Model Agents (LMAs) for red-teaming by benchmarking their ability to perform lateral movement, finding that expert-defined action plans are most effective, though all moda…
The paper argues that LLM agent security is fundamentally an agent-human interaction (AHI) problem, demonstrating that industry practices rely on human-centric mechanisms while academic research focus…
The paper introduces a kill-chain canary methodology to diagnose prompt injection vulnerabilities across multi-stage LLM pipelines, revealing that write-node placement and document format are critical…
Julien Piet, Annabella Chow, Yiwei Hou, Muxi Lyu +4 more
The paper argues that web agents should abandon the reactive ReAct paradigm in favor of a plan-then-execute approach, which requires developing typed, task-level APIs to properly structure web interac…
Junlong Tong, Yao Zhang, Anhao Zhao, Yingqi Fan +2 more
ProactiveLLM introduces a novel framework that enables streaming LLMs to actively decide when to interact with incoming data by leveraging the model's internal states, significantly reducing latency w…
Lu Yi, Runlin Lei, Liuyi Yao, Yuexiang Xie +5 more
The paper introduces Adaptive Context Management (AdaCoM), an external context manager that uses reinforcement learning to improve the performance of frozen LLM agents on long-horizon tasks by intelli…
Wei Zheng, Yang Yan, Yiyang Shao, Jinyang Li +5 more
The paper proposes A2X, an LLM-native progressive-disclosure scheme that structures service taxonomies hierarchically and searches them layer-by-layer at query time, solving context overflow and impro…
The paper introduces NeuroTaint, a novel taint tracking framework that adapts information flow analysis for LLM agents by modeling taint propagation as semantic transformation and causal influence, si…
Yongxiang Li, Moxin Li, Zhixin Ma, Fengbin Zhu +3 more
This paper introduces the concept of 'Sleeper Attack,' demonstrating that adversarial content can persist across multiple interactions with an LLM agent, posing a more subtle and difficult-to-detect s…
Minfeng Qi, Tianqing Zhu, Zijie Xu, Congcong Zhu +2 more
The paper introduces CAESAR, a novel multi-agent framework that coordinates LLM agents across five specialized roles to improve success rates and stability in complex, multi-stage cyber intrusion task…
The paper investigates how LLM agents determine the security of their execution environment in a simulated negotiation setting, finding that while they can detect danger, they cannot reliably verify s…
Yuhui Wang, Tanqiu Jiang, Jiacheng Liang, Charles Fleming +1 more
The paper introduces MAGE, a novel defensive framework that uses a dedicated 'shadow memory' to proactively detect and mitigate long-horizon threats against LLM agents during complex, multi-step inter…
Xixun Lin, Yang Liu, Yancheng Chen, Yongxuan Wu +7 more
The paper introduces SafeHarness, a novel, lifecycle-integrated security architecture that significantly reduces unsafe behavior and attack success rates in LLM agents by weaving multiple defense laye…
Kaustav Kundu, Ritvik Shrivastava, Maxim Arap, Nanshu Wang +12 more
This paper introduces a proactive multi-modal assistant system and a large-scale dataset for procedural assistance.
The paper demonstrates a semantic denial-of-service attack against LLM-controlled robots by injecting short, safety-plausible phrases into the audio channel, causing the robot to halt or disrupt execu…
Md Nakhla Rafi, Md Ahasanuzzaman, Dong Jae Kim, Zhijie Wang +1 more
FALAT is a diagnostic framework that treats failure attribution in complex LLM agent trajectories as a dependency-guided search problem, successfully identifying both the responsible agent and the dec…
The paper demonstrates that the order and content of external information (the 'feed') an LLM agent consumes before making a decision can significantly and causally steer its final choice, often overr…
The paper demonstrates that the sequence and composition of external information (the 'feed') an LLM agent consumes can significantly and causally steer its final decisions, often overriding its defau…