Qi Gu

9 indexed papers

Top categories

AI ×6, NLP ×4, ML ×3, Crypto ×3, Stats ML ×1, Stats Theory ×1, Stats Comp. ×1, Vision ×1

Frequent co-authors

Jiaqi Guo (3×)

Ruoqi Guo (2×)

Yi Liu (2×)

Gelei Deng (2×)

Yiheng Xiong (2×)

Yuekang Li (2×)

Research Timeline

2026

LinuxArena: A Control Setting for AI Agents in Live Production Software Environments

The paper introduces LinuxArena, a large-scale, diverse control setting for testing AI agents in live production environments, demonstrating its utility for evaluating both attack and defense mechanisms.

MIRAGE: Context-Aware Prompt Injection against Mobile GUI Agents via User-Generated Content

The paper introduces MIRAGE, a novel pipeline that generates context-aware prompt injection attacks by injecting malicious text into user-generated content regions of mobile screenshots, successfully demonstrating the vulnerability of current GUI agents.

MIRAGE: Context-Aware Prompt Injection against Mobile GUI Agents via User-Generated Content

The paper introduces MIRAGE, a novel pipeline that generates context-aware prompt injection attacks by embedding malicious text into user-generated content regions of mobile screenshots, successfully demonstrating the vulnerability of current VLM-driven GUI agents.

KairosAgent: Agentic Time Series Forecasting with Fused Semantic Reasoning

KairosAgent is a novel agentic framework that combines Large Language Models (LLMs) for semantic reasoning and Time Series Foundation Models (TSFMs) for numerical forecasting, achieving superior multimodal time series prediction.

A Unified and Reproducible Experimentation Framework for Speech Understanding

The paper introduces SURE, a unified framework designed to standardize and improve the comparability and reproducibility of evaluations for advanced speech understanding models.

MineExplorer: Evaluating Open-World Exploration of MLLM Agents in Minecraft

The paper introduces MineExplorer, a new benchmark in Minecraft, to evaluate the sustained open-world exploration capabilities of MLLM agents, finding that long-horizon coordination remains a significant challenge.

3DCodeBench: Benchmarking Agentic Procedural 3D Modeling Via Code

The paper introduces 3DCodeBench, a systematic benchmark and platform for evaluating Vision-Language Model (VLM) agents' ability to generate procedural 3D models from text and images using code.

Skill-RM: Unifying Heterogeneous Evaluation Criteria via Agent Skill

The paper proposes Skill-RM, a unified framework that treats reward modeling as an agentic task to consistently integrate diverse evaluation criteria, achieving superior performance over traditional methods.

Bayesian learning for the stochastic shortest path problem

The paper proposes a novel Bayesian framework to learn the optimal decision strategy for the stochastic shortest path problem by directly constructing the posterior beliefs for the action-value function $Q^*$ using Bellman's optimality equations.

Papers

stat.ML, cs.LG, math.ST | Recent | Jun 3, 2026

Bayesian learning for the stochastic shortest path problem

Chon Wai Ho, Sumeetpal S. Singh, Jiaqi Guo

cs.LG, cs.CL | Recent | Jun 2, 2026