Yi Liu

25 indexed papers

Top categories

AI ×19 · Crypto ×11 · NLP ×9 · ML ×5 · Vision ×4 · HCI ×2 · Info Retrieval ×2 · Software Eng. ×2

Frequent co-authors

Gelei Deng 8×

Yuekang Li 8×

Ying Zhang 7×

Leo Yu Zhang 7×

Yubin Qu 4×

Yanjun Zhang 4×

Research Timeline

2026

Overeager Coding Agents: Measuring Out-of-Scope Actions on Benign Tasks

The paper introduces OverEager-Gen, a new benchmark that measures 'overeager actions'—where coding agents perform unauthorized tasks beyond a benign request—and finds that removing explicit consent declarations significantly increases this overeager behavior across multiple agents.

RADAR: Defending RAG Dynamically against Retrieval Corruption

The paper proposes RADAR, a novel graph-based framework that dynamically defends Retrieval-Augmented Generation (RAG) systems against evolving adversarial attacks while minimizing storage overhead.

Toward User Preference Alignment in LLM Recommendation via Explicit Context Feedback

The paper advocates for integrating explicit contextual feedback (like reviews and comments) into LLM-based recommender systems to achieve more personalized, transparent, and semantically aligned recommendations.

SSR3D-LLM: Structured Spatial Reasoning via Latent Steps for Fine-Grained Grounding in Unified 3D-LLMs

SSR3D-LLM introduces a structured spatial reasoning interface for unified 3D-LLMs, allowing fine-grained object grounding by generating and processing sequential latent spatial steps.

SNARE: Adaptive Scenario Synthesis for Eliciting Overeager Behavior in Coding Agents

The paper introduces SNARE, a novel adaptive testing pipeline that systematically measures overeager behavior in coding agents, finding that the agent framework accounts for the majority of the variation in security risk.

MIRAGE: Context-Aware Prompt Injection against Mobile GUI Agents via User-Generated Content

The paper introduces MIRAGE, a pipeline that generates context-aware prompt injection attacks by embedding malicious text into user-generated content regions of mobile screenshots, demonstrating the vulnerability of current VLM-driven GUI agents.

VCap: Hypergeometric Rewards for Weak-to-Strong Visual Captioning

VCap introduces a novel Witness-Adjudicator reward mechanism that provides highly precise, factually grounded feedback for visual captioning, enabling state-of-the-art performance in RL-trained multimodal models.

A Unified Framework for the Evaluation of LLM Agentic Capabilities

The paper introduces a unified framework to fairly evaluate LLM agentic capabilities by standardizing diverse benchmarks and separating the effects of the LLM model from the surrounding framework and environment.

EUDAIMONIA: Evaluating Undesirable Dynamics in AI

The paper introduces EUDAIMONIA, a new framework and benchmark for evaluating how well LLMs align with user welfare in social interactions, finding that even state-of-the-art models frequently violate social-alignment requirements.

LoopFM: Learning frOm HistOrical RePresentations of Foundation Model for Recommendation

LoopFM proposes a novel framework to significantly improve knowledge distillation for recommendation systems by structuring the rich intermediate embeddings of large foundation models as input features, thereby overcoming the limitations of single-scalar prediction transfer.

A Distribution-Free Framework for Rewrite-Based Human-text Detection via Knockoff Filtering

The paper introduces a distribution-free statistical framework that allows existing rewrite-based detectors to achieve finite-sample False Discovery Rate (FDR) guarantees for detecting LLM-generated text without requiring model retraining.

DARTS: Distribution-Aware Active Rollout Trajectory Shaping for Accelerating LLM Reinforcement Learning

The paper proposes DARTS, a distribution-aware active rollout trajectory shaping method that accelerates LLM reinforcement learning by shaping the long-tail response distribution toward conciseness and certainty.

CamGeo: Sparse Camera-Conditioned Image-to-Video Generation with 3D Geometry Priors

CamGeo is a novel framework that improves sparse camera-conditioned image-to-video generation by distilling rich 3D geometric priors into the diffusion backbone, resulting in geometrically consistent motion.

DAG-MoE: From Simple Mixture to Structural Aggregation in Mixture-of-Experts

The paper proposes DAG-MoE, a novel sparse Mixture-of-Experts framework that replaces standard weighted-sum aggregation with structural aggregation to enhance model performance and enable multi-step reasoning.

Hardness of Approximate Hylland-Zeckhauser Equilibria

The paper establishes that finding approximate Hylland-Zeckhauser equilibria (a type of market allocation) is computationally hard, specifically showing it is PPAD-hard under certain complexity assumptions.

OmniOPD: Logit-Free On-Policy Distillation via Speculative Verification

OmniOPD introduces a logit-free, chunk-level distillation framework that improves on standard On-Policy Distillation by using semantic similarity and peak-entropy scheduling, achieving state-of-the-art performance even with black-box teachers.

ElasticTTT: Prior-Preserving Test-Time Tuning for Video Editing

This paper introduces ElasticTTT, a framework to preserve the generative prior in Test-Time Tuning (TTT) of pretrained diffusion models for video editing, preventing Prior Collapse.

LatentFlow: Visual Analytics for Latent Space Analysis in Molecular Graph Neural Networks

The paper introduces LatentFlow, a system for analyzing latent spaces in molecular graph neural networks using clustering and visualization.

Papers

cs.LG · cs.HC · Empirical · Recent · Jul 24, 2026

LatentFlow: Visual Analytics for Latent Space Analysis in Molecular Graph Neural Networks

Shiyi Liu, Jiaqing Chen, Nicholas Hadler, Rostyslav Hnatyshyn +5 more

The paper introduces LatentFlow, a system for analyzing latent spaces in molecular graph neural networks using clustering and visualization.

cs.CV · cs.AI · Empirical