Ming Li

33 indexed papers

Recent (6 mo)

With code

Influential cites

Benchmarked

Publications per year

Top categories

AI×22Crypto×12NLP×7ML×6Vision×5Multimedia×2Info Retrieval×1Stats ML×1

Frequent co-authors

Yiming Liu4×

Ming Liu4×

Tong Yang4×

Yiming Li4×

Ziming Li3×

Yaoming Li3×

Research Timeline

2026

LiveBrowseComp: Are Search Agents Searching, or Just Verifying What They Already Know?

The paper argues that current search agents often verify existing knowledge rather than genuinely searching, and introduces LiveBrowseComp, a new benchmark to measure true evidence-driven discovery.

Reinforcement Learning with Robust Rubric Rewards

The paper introduces $ ext{RLR}^3$, a novel framework that extends verifiable rewards in Reinforcement Learning to handle partially verifiable, multi-criteria vision-language tasks by integrating robust rubric scoring.

BORA: Bridging Offline Reinforcement Learning and Online Residual Adaptation for Real-World Dexterous VLA Models

BORA is an offline-to-online RL framework that enhances dexterous VLA models for real-world robotics by using an action-conditioned critic and a lightweight residual adaptation mechanism to correct execution errors.

Compass: Navigating Global Marine Lead Data Integration through Expert-Guided LLM Agent

The paper introduces Compass, an expert-guided LLM agent framework that successfully extracts and integrates thousands of previously inaccessible marine lead records from vast corpora of scientific papers, creating a major new global database.

ESPO: Early-Stopping Proximal Policy Optimization

ESPO is a novel reinforcement learning algorithm that detects trajectory failure in large language models and terminates rollouts early, significantly improving performance on mathematical reasoning benchmarks while reducing computational cost.

ConMoE: Expert-Pool Consolidation via Prototype Reassignment for MoE Compression

ConMoE proposes a train-free method for compressing Mixture-of-Experts (MoE) models by consolidating the large expert pool into a smaller set of reusable prototypes and deterministically remapping all original expert calls to these prototypes.

Counterfactual Graph for Multi-Agent LLM Calibration

The paper proposes CAGE-CAL, a counterfactual graph calibration framework, to accurately assess the reliability and detect over-confidence in multi-agent LLM systems after agents communicate.

Cert-LAS: Toward Certified Model Ownership Verification for Text-to-Image Diffusion Models via Layer-Adaptive Smoothing

The paper proposes Cert-LAS, a novel certified method for verifying model ownership in text-to-image diffusion models, which is robust against malicious signal removal attacks.

StemBind: When MLLMs Get Lost Between Rules and Instances in Abstract Visual Reasoning

The paper introduces StemBind, a diagnostic benchmark that separates perception, rule induction, and answer selection in abstract visual reasoning, revealing that the primary failure point for MLLMs is the mapping of the identified rule to the correct instance.

ProtStructQA: A Denotation Threshold in Protein Structural Reasoning

The paper introduces ProtStructQA, an executable benchmark that tests protein structural reasoning by requiring language models to generate measurable 3D coordinates, revealing a capability-dependent transition point where chain-of-thought reasoning surpasses tool-mediated approaches.

Med-HEAL: Analyzing and Mitigating Hallucinations in Medical LLMs with Hallucination-Aware In-Context Learning

The paper introduces Med-HEAL, a comprehensive framework and dataset for systematically identifying and mitigating hallucinations in medical LLMs, demonstrating that a self-critique pipeline significantly improves model accuracy.

Repurposing Adversarial Perturbations for Continual Learning: From Defense to Active Alignment

The paper introduces AdvCL, a framework that repurposes adversarial perturbations as a geometric control signal to stabilize continual learning in large language models, significantly reducing forgetting and enhancing robustness.

AdaCodec: A Predictive Visual Code for Video MLLMs

AdaCodec introduces a predictive visual coding scheme for video MLLMs, significantly improving efficiency and performance by transmitting only inter-frame changes and full reference frames when necessary.

Order within Chaos: Capturing Intrinsic Energy Anomalies for AI-Manipulated Image Forgery Localization

The paper proposes FLAME, a novel framework that detects AI-generated image forgeries by identifying intrinsic energy anomalies caused by the diffusion process, achieving state-of-the-art localization.

A Primer in Post-Training Reasoning Data: What We Know About How It Works

This paper synthesizes over 150 scattered studies and reports to provide the first comprehensive primer on post-training reasoning data, organizing the field around data objects, utility, construction, and scalability.

CAPF: Guiding Search-Agent Rollouts with Credit-Attenuated Privileged Feedback

The paper proposes Credit-Attenuated Privileged Feedback (CAPF), a training-time mechanism that uses verifier-side information to guide LLM search agents, significantly improving their performance on complex QA tasks.

Token Predictors Are Not Planners: Building Physically Grounded Causal Reasoners

The paper argues that current embodied planning benchmarks prioritize superficial language prediction over true physical reasoning, introducing new benchmarks and a large-scale dataset to demonstrate that physically grounded causal reasoning is necessary for reliable autonomous agents.

Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models

This paper introduces Imaginative Perception Tokens (IPT) to improve spatial reasoning in vision language models.

A Quantitative Approximation Framework for Flow Distillation in Diffusion Models

The paper develops a quantitative framework to analyze and improve flow distillation in diffusion models, providing stability guarantees and suggesting non-uniform time scheduling to reduce approximation errors.

OneReason Technical Report

The paper proposes OneReason, a framework that enhances the reasoning capability of generative recommendation models by focusing on improving item perception and structuring user behavior into coherent latent interests.

Highlighted terms show continued research focus across papers

Papers

cs.IRcs.AIcs.CLRecentJun 4, 2026

OneReason Technical Report

OneRec Team, Biao Yang, Boyang Ding, Chenglong Chu +80 more

View →

cs.AIRecentJun 2, 2026