Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:
ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Home/Authors/Yi Liu

Yi Liu

23 indexed papers

Recent (6 mo)
23
With code
0
Influential cites
0
Benchmarked
0

Publications per year

23
26

Top categories

AI×18Crypto×11NLP×9ML×4Vision×3Info Retrieval×2Software Eng.×2Game Theory×1

Frequent co-authors

Gelei Deng8×
Yuekang Li8×
Ying Zhang7×
Leo Yu Zhang7×
Yubin Qu4×
Yanjun Zhang4×

Research Timeline

2026
Hackers or Hallucinators? A Comprehensive Analysis of LLM-Based Automated Penetration Testing

This paper provides the first comprehensive systematization and large-scale empirical evaluation of existing LLM-based Automated Penetration Testing (AutoPT) frameworks, offering a structured taxonomy and unified benchmark for the field.

Membership Inference Attacks Against Video Large Language Models

This paper presents a black-box membership inference attack (MIA) against Video Large Language Models (VideoLLMs), demonstrating that they are vulnerable by analyzing generation behavior across varying decoding temperatures.

Overeager Coding Agents: Measuring Out-of-Scope Actions on Benign Tasks

The paper introduces OverEager-Gen, a new benchmark that measures 'overeager actions'—where coding agents perform unauthorized tasks beyond a benign request—and finds that removing explicit consent declarations significantly increases this overeager behavior across multiple agents.

RADAR: Defending RAG Dynamically against Retrieval Corruption

The paper proposes RADAR, a novel graph-based framework that dynamically defends Retrieval-Augmented Generation (RAG) systems against evolving adversarial attacks while minimizing storage overhead.

Toward User Preference Alignment in LLM Recommendation via Explicit Context Feedback

The paper advocates for integrating explicit contextual feedback (like reviews and comments) into LLM-based recommender systems to achieve more personalized, transparent, and semantically aligned recommendations.

SSR3D-LLM: Structured Spatial Reasoning via Latent Steps for Fine-Grained Grounding in Unified 3D-LLMs

SSR3D-LLM introduces a structured spatial reasoning interface for unified 3D-LLMs, allowing fine-grained object grounding by generating and processing sequential latent spatial steps.

SNARE: Adaptive Scenario Synthesis for Eliciting Overeager Behavior in Coding Agents

The paper introduces SNARE, a novel adaptive testing pipeline that systematically measures overeager behavior in coding agents, finding that the agent framework accounts for the majority of the variation in security risk.

MIRAGE: Context-Aware Prompt Injection against Mobile GUI Agents via User-Generated Content

The paper introduces MIRAGE, a novel pipeline that generates context-aware prompt injection attacks by injecting malicious text into user-generated content regions of mobile screenshots, successfully demonstrating the vulnerability of current GUI agents.

VCap: Hypergeometric Rewards for Weak-to-Strong Visual Captioning

VCap introduces a novel Witness-Adjudicator reward mechanism that provides highly precise, factually grounded feedback for visual captioning, enabling state-of-the-art performance in RL-trained multimodal models.

A Unified Framework for the Evaluation of LLM Agentic Capabilities

The paper introduces a unified framework to fairly evaluate LLM agentic capabilities by standardizing diverse benchmarks and separating the effects of the LLM model from the surrounding framework and environment.

SNARE: Adaptive Scenario Synthesis for Eliciting Overeager Behavior in Coding Agents

The paper introduces SNARE, a novel adaptive benchmarking pipeline that systematically measures overeager behavior in coding agents, finding that the agent framework accounts for the majority of the variation in security risk.

MIRAGE: Context-Aware Prompt Injection against Mobile GUI Agents via User-Generated Content

The paper introduces MIRAGE, a novel pipeline that generates context-aware prompt injection attacks by embedding malicious text into user-generated content regions of mobile screenshots, successfully demonstrating the vulnerability of current VLM-driven GUI agents.

EUDAIMONIA: Evaluating Undesirable Dynamics in AI

The paper introduces EUDAIMONIA, a new framework and benchmark for evaluating how well LLMs align with user welfare in social interactions, finding that even state-of-the-art models frequently violate social-alignment requirements.

LoopFM: Learning frOm HistOrical RePresentations of Foundation Model for Recommendation

LoopFM proposes a novel framework to significantly improve knowledge distillation for recommendation systems by structuring the rich intermediate embeddings of large foundation models as input features, thereby overcoming the limitations of single-scalar prediction transfer.

A Distribution-Free Framework for Rewrite-Based Human-text Detection via Knockoff Filtering

The paper introduces a distribution-free statistical framework that allows existing rewrite-based detectors to achieve finite-sample False Discovery Rate (FDR) guarantees for detecting LLM-generated text without requiring model retraining.

DARTS: Distribution-Aware Active Rollout Trajectory Shaping for Accelerating LLM Reinforcement Learning

The paper proposes DARTS, a distribution-aware active rollout trajectory shaping method that fundamentally accelerates LLM reinforcement learning by actively shaping the long-tail response distribution towards conciseness and certainty.

CamGeo: Sparse Camera-Conditioned Image-to-Video Generation with 3D Geometry Priors

CamGeo is a novel framework that improves sparse camera-conditioned image-to-video generation by distilling rich 3D geometric priors into the diffusion backbone, resulting in geometrically consistent motion.

DAG-MoE: From Simple Mixture to Structural Aggregation in Mixture-of-Experts

The paper proposes DAG-MoE, a novel sparse Mixture-of-Experts framework that replaces standard weighted-sum aggregation with structural aggregation to enhance model performance and enable multi-step reasoning.

Hardness of Approximate Hylland-Zeckhauser Equilibria

The paper establishes that finding approximate Hylland-Zeckhauser equilibria (a type of market allocation) is computationally hard, specifically showing it is PPAD-hard under certain complexity assumptions.

OmniOPD: Logit-Free On-Policy Distillation via Speculative Verification

OmniOPD introduces a logit-free, chunk-level distillation framework that improves on standard On-Policy Distillation by using semantic similarity and peak-entropy scheduling, achieving state-of-the-art performance even with black-box teachers.

Highlighted terms show continued research focus across papers

Papers

cs.AIRecentMay 31, 2026

DAG-MoE: From Simple Mixture to Structural Aggregation in Mixture-of-Experts

Jiarui Feng, Hanqing Zeng, Karish Grover, Ruizhong Qiu +10 more

The paper proposes DAG-MoE, a novel sparse Mixture-of-Experts framework that replaces standard weighted-sum aggregation with structural aggregation to enhance model performance and enable multi-step r…

View →
cs.GTcs.CCRecentMay 31, 2026

Hardness of Approximate Hylland-Zeckhauser Equilibria

Mark Braverman, Jingyi Liu, Eric Xue, Chenghan Zhou

The paper establishes that finding approximate Hylland-Zeckhauser equilibria (a type of market allocation) is computationally hard, specifically showing it is PPAD-hard under certain complexity assump…

View →
cs.LGcs.CLRecentMay 31, 2026

OmniOPD: Logit-Free On-Policy Distillation via Speculative Verification

Yuhang Zhou, Lizhu Zhang, Yifan Wu, Mingyi Wang +4 more

OmniOPD introduces a logit-free, chunk-level distillation framework that improves on standard On-Policy Distillation by using semantic similarity and peak-entropy scheduling, achieving state-of-the-ar…

View →
stat.MEcs.AIstat.APRecentMay 29, 2026

A Distribution-Free Framework for Rewrite-Based Human-text Detection via Knockoff Filtering

Yi Liu

The paper introduces a distribution-free statistical framework that allows existing rewrite-based detectors to achieve finite-sample False Discovery Rate (FDR) guarantees for detecting LLM-generated t…

View →
cs.LGcs.AIRecentMay 29, 2026

DARTS: Distribution-Aware Active Rollout Trajectory Shaping for Accelerating LLM Reinforcement Learning

Yujie Wang, Siwei Chen, Longzan Luo, Xinyi Liu +3 more

The paper proposes DARTS, a distribution-aware active rollout trajectory shaping method that fundamentally accelerates LLM reinforcement learning by actively shaping the long-tail response distributio…

View →
cs.CERecentMay 29, 2026

CamGeo: Sparse Camera-Conditioned Image-to-Video Generation with 3D Geometry Priors

Xuanyi Liu, Deyi Ji, Liqun Liu, Lanyun Zhu +7 more

CamGeo is a novel framework that improves sparse camera-conditioned image-to-video generation by distilling rich 3D geometric priors into the diffusion backbone, resulting in geometrically consistent…

View →
cs.CLcs.AIcs.HCRecentMay 28, 2026

EUDAIMONIA: Evaluating Undesirable Dynamics in AI

Jun Rui Huang, Wang Bill Zhu, Ziyi Liu, Nathanael Fast +2 more

The paper introduces EUDAIMONIA, a new framework and benchmark for evaluating how well LLMs align with user welfare in social interactions, finding that even state-of-the-art models frequently violate…

View →
cs.LGcs.AIcs.IRRecentMay 28, 2026

LoopFM: Learning frOm HistOrical RePresentations of Foundation Model for Recommendation

Shali Jiang, Hua Zheng, Boyang Liu, Laming Chen +39 more

LoopFM proposes a novel framework to significantly improve knowledge distillation for recommendation systems by structuring the rich intermediate embeddings of large foundation models as input feature…

View →
cs.IRcs.AIRecentMay 27, 2026

Toward User Preference Alignment in LLM Recommendation via Explicit Context Feedback

Weizhi Zhang, Wooseong Yang, Yuxin Cui, Zhaohui Guo +8 more

The paper advocates for integrating explicit contextual feedback (like reviews and comments) into LLM-based recommender systems to achieve more personalized, transparent, and semantically aligned reco…

View →
cs.CVcs.AIRecentMay 27, 2026

SSR3D-LLM: Structured Spatial Reasoning via Latent Steps for Fine-Grained Grounding in Unified 3D-LLMs

Jiawei Li, Ziyi Liu, Weijie Shi, Long Chen +2 more

SSR3D-LLM introduces a structured spatial reasoning interface for unified 3D-LLMs, allowing fine-grained object grounding by generating and processing sequential latent spatial steps.

View →
cs.CRcs.AIcs.CLRecentMay 27, 2026

SNARE: Adaptive Scenario Synthesis for Eliciting Overeager Behavior in Coding Agents

Yubin Qu, Yi Liu, Gelei Deng, Yanjun Zhang +3 more

The paper introduces SNARE, a novel adaptive testing pipeline that systematically measures overeager behavior in coding agents, finding that the agent framework accounts for the majority of the variat…

View →
cs.CRcs.AIcs.CLRecentMay 27, 2026

MIRAGE: Context-Aware Prompt Injection against Mobile GUI Agents via User-Generated Content

Ruoqi Guo, Yi Liu, Gelei Deng, Yiheng Xiong +6 more

The paper introduces MIRAGE, a novel pipeline that generates context-aware prompt injection attacks by injecting malicious text into user-generated content regions of mobile screenshots, successfully…

View →
cs.CVcs.AIcs.CLRecentMay 27, 2026

VCap: Hypergeometric Rewards for Weak-to-Strong Visual Captioning

Xingyu Lu, Jinpeng Wang, Yi-Fan Zhang, Yankai Yang +12 more

VCap introduces a novel Witness-Adjudicator reward mechanism that provides highly precise, factually grounded feedback for visual captioning, enabling state-of-the-art performance in RL-trained multim…

View →
cs.AIRecentMay 27, 2026

A Unified Framework for the Evaluation of LLM Agentic Capabilities

Pengyu Zhu, Lijun Li, Yaxing Lyu, Qianxin Luo +7 more

The paper introduces a unified framework to fairly evaluate LLM agentic capabilities by standardizing diverse benchmarks and separating the effects of the LLM model from the surrounding framework and…

View →
cs.CRcs.AIcs.CLRecentMay 27, 2026

SNARE: Adaptive Scenario Synthesis for Eliciting Overeager Behavior in Coding Agents

Yubin Qu, Yi Liu, Gelei Deng, Yanjun Zhang +3 more

The paper introduces SNARE, a novel adaptive benchmarking pipeline that systematically measures overeager behavior in coding agents, finding that the agent framework accounts for the majority of the v…

View →
cs.CRcs.AIcs.CLRecentMay 27, 2026

MIRAGE: Context-Aware Prompt Injection against Mobile GUI Agents via User-Generated Content

Ruoqi Guo, Yi Liu, Gelei Deng, Yiheng Xiong +6 more

The paper introduces MIRAGE, a novel pipeline that generates context-aware prompt injection attacks by embedding malicious text into user-generated content regions of mobile screenshots, successfully…

View →
cs.CRcs.LGRecentMay 21, 2026

RADAR: Defending RAG Dynamically against Retrieval Corruption

Ziyuan Chen, Yueming Lyu, Yi Liu, Weixiang Han +3 more

The paper proposes RADAR, a novel graph-based framework that dynamically defends Retrieval-Augmented Generation (RAG) systems against evolving adversarial attacks while minimizing storage overhead.

View →
cs.SEcs.AIcs.CLRecentMay 18, 2026

Overeager Coding Agents: Measuring Out-of-Scope Actions on Benign Tasks

Yubin Qu, Ying Zhang, Yanjun Zhang, Gelei Deng +3 more

The paper introduces OverEager-Gen, a new benchmark that measures 'overeager actions'—where coding agents perform unauthorized tasks beyond a benign request—and finds that removing explicit consent de…

View →
cs.CRRecentApr 29, 2026

Membership Inference Attacks Against Video Large Language Models

Wei Song, Yuxin Cao, Ziqi Ding, Yi Liu +2 more

This paper presents a black-box membership inference attack (MIA) against Video Large Language Models (VideoLLMs), demonstrating that they are vulnerable by analyzing generation behavior across varyin…

View →
cs.CRcs.AIcs.SERecentApr 7, 2026

Hackers or Hallucinators? A Comprehensive Analysis of LLM-Based Automated Penetration Testing

Jiaren Peng, Zeqin Li, Chang You, Yan Wang +16 more

This paper provides the first comprehensive systematization and large-scale empirical evaluation of existing LLM-based Automated Penetration Testing (AutoPT) frameworks, offering a structured taxonomy…

View →