20 results for “GUI agent, visual memory, failure, taxonomy, AGMem”
CS papers onlyHybrid search: Keyword + semantic, ranked by combined score.ⓘ
Want pure semantic search? Try claim verification →
Seoyoung Choi, Minseok Ko, Hyunseok Lee, Kunwoong Kim +3 more
This paper introduces a taxonomy of GUI agent failures and finds that full-image memory has divergent effects on failure distribution. It proposes Action-Grounded Visual Memory (AGMem) as an effective…
The paper introduces memorywire, a vendor-neutral JSON-Schema wire format and reference implementation designed to standardize and govern memory operations across disparate agent-memory frameworks.
Qingshan Liu, Guoqing Wang, Wen Wu, Jingqi Huang +4 more
MemPro introduces a system-level evolution framework that treats the entire memory construction-retrieval pipeline as an evolvable program, significantly improving long-horizon agent performance over…
The paper introduces memorywire, a vendor-neutral JSON-Schema 2020-12 wire format and reference implementation to standardize and govern agent memory operations across diverse, proprietary agent-memor…
Jizhan Fang, Buqiang Xu, Zhixian Wang, Haoliang Cao +11 more
The paper proposes FluxMem, a novel connectivity-evolving memory framework that models memory as a dynamic graph to improve LLM agent performance in complex, changing environments.
The paper introduces MemCog, a Memory-as-Cognition system that integrates memory access directly into the reasoning process, significantly improving agent performance, especially in proactive memory r…
This paper analyzes failure modes in collaborative visual reasoning systems, demonstrating that naive shared workspaces can amplify hallucinations and proposing diagnostics for improving communication…
Pritam Dash, Tongyu Ge, Aditi Jain, Tanmay Shah +1 more
This paper systematically studies memory poisoning attacks in LLM agents, identifying multiple vulnerabilities and proposing a new benchmark to assess the risk.
The paper identifies and measures a critical failure mode where LLM agents violate policies by losing or corrupting directive-bearing state during the process of assembling the decision context, and p…
Xuanye Zhang, Yongsen Zheng, Zhuqin Xu, Kaiyu Zhou +4 more
MemMorph introduces a novel memory poisoning attack that biases LLM agent tool selection by injecting crafted records into the agent's long-term memory, achieving high success rates even against moder…
Xiaoyi Chen, Yifei Gao, Yang Xu, Xingxing Song +2 more
The paper introduces GUITestScape, a comprehensive benchmark for exploratory GUI testing, and GUIJudge, an open-set evaluator that significantly improves the assessment of AI agents' defect detection…
This paper analyzes memory poisoning attacks targeting multi-agent systems (MAS) powered by LLMs, proposing mitigation strategies across various memory types, especially focusing on secure design prin…
The paper introduces POIROT, a novel protocol that uses the agents within a multi-agent system itself to diagnose and detect failures, demonstrating superior performance over traditional evaluation me…
Xianyou Li, Weiran Yan, Yichao Wu, Penghao Liang +3 more
This paper introduces a failure-aware observability framework to diagnose wasted computation in multi-agent LLM systems by mapping recurring failure modes to online trace signals.
Xinle Deng, Ruobin Zhong, Hujin Peng, Xiaoben Lu +14 more
The paper introduces MemTrace, a framework that treats LLM memory pipelines as traceable graphs to systematically diagnose and automatically correct memory-related errors, boosting performance by up t…
This survey establishes persistent, writable memory as an independent security problem for LLM agents, proposing a comprehensive framework for 'mnemonic sovereignty' to govern the entire memory lifecy…
Nahyun Lee, Dongkeun Yoon, Guijin Son, Geewook Kim +11 more
The paper introduces K-BrowseComp, a new web-browsing agent benchmark of 400 problems grounded in Korean contexts, demonstrating that current frontier LLMs struggle significantly with complex, context…
Tao Feng, Chongrui Ye, Tianyang Luo, Jingjun Xu +4 more
ElasticMem introduces a novel framework that treats memory as an elastic latent resource, allowing LLM agents to adaptively manage and inject variable-budget memories for improved performance in long-…
The paper proposes MemPoison, a novel memory poisoning attack that injects triggerable backdoors into LLM agents' long-term memory through dialogue interactions, achieving high success rates by bypass…
The paper introduces MemPoison, a novel memory poisoning attack that successfully injects triggerable backdoors into LLM agents' long-term memory through conversational interactions, achieving high at…