Yimin

46 indexed papers

Top categories

AI ×32, Crypto ×16, ML ×15, Vision ×9, NLP ×8, Software Eng. ×4, Distributed ×3, Multimedia ×3

Frequent co-authors

Yiming Zhang 7×

Yiming Li 5×

Yiming Liu 4×

Yiming Wang 3×

Wei Zhou 2×

Koji Tsuda 2×

Research Timeline

2026

Implicit Drifting Policy: One-Step Action Generation via Conditional Expert Geometry

The Implicit Drifting Policy (IDP) is a novel one-step action generation framework that implicitly enforces trajectory correction constraints by analyzing local expert action geometry, overcoming the difficulties of explicitly estimating a training-time drifting field.

Med-HEAL: Analyzing and Mitigating Hallucinations in Medical LLMs with Hallucination-Aware In-Context Learning

The paper introduces Med-HEAL, a comprehensive framework and dataset for systematically identifying and mitigating hallucinations in medical LLMs, demonstrating that a self-critique pipeline significantly improves model accuracy.

SimSD: Simple Speculative Decoding in Diffusion Language Models

The paper proposes SimSD, a plug-and-play speculative decoding algorithm that adapts diffusion language models (dLLMs) to achieve fast, token-level acceleration by restoring causal masking capabilities.

Order within Chaos: Capturing Intrinsic Energy Anomalies for AI-Manipulated Image Forgery Localization

The paper proposes FLAME, a novel framework that detects AI-generated image forgeries by identifying intrinsic energy anomalies caused by the diffusion process, achieving state-of-the-art localization.

CAPF: Guiding Search-Agent Rollouts with Credit-Attenuated Privileged Feedback

The paper proposes Credit-Attenuated Privileged Feedback (CAPF), a training-time mechanism that uses verifier-side information to guide LLM search agents, significantly improving their performance on complex QA tasks.

Token Predictors Are Not Planners: Building Physically Grounded Causal Reasoners

The paper argues that current embodied planning benchmarks prioritize superficial language prediction over true physical reasoning, introducing new benchmarks and a large-scale dataset to demonstrate that physically grounded causal reasoning is necessary for reliable autonomous agents.

Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models

This paper introduces Imaginative Perception Tokens (IPT) to improve spatial reasoning in vision language models.

ReproRepo: Scaling Reproducibility Audits with GitHub Repository Issues

The paper introduces ReproRepo, a scalable framework for evaluating the reproducibility of machine learning research using LLM agents and human-raised GitHub issues.

Accelerating Disaggregated RL for Visual Generative LLMs with Diffusion-Based Parallelism and Trainer-Assisted Generation

DigenRL is a disaggregated RL framework for diffusion-based generative LLMs that achieves 1.56-2.10x throughput improvements over state-of-the-art diffusion RL systems.

DiStash: A Disaggregated Multi-Stash Transactional Key-Value Store

This paper introduces DiStash, a disaggregated transactional key-value store that enables an application to use a single transaction to manage key-value pairs across different pools of stashes, preventing race conditions and data loss.

Rethinking Speech-LLM Integration for ASR: Effective Joint Speech-Text Training by Interleaving

This paper proposes Joint Speech-Text Interleaved Pretraining (JSTIP) for speech recognition, which constructs interleaved speech-text sequences and achieves consistent improvements in entity accuracy.

MedPMC: A Systematic Framework for Scaling High-Fidelity Medical Multimodal Data for Foundation Models

The paper introduces MedPMC, a framework that transforms permissively licensed literature into high-fidelity infrastructure for medical multimodal models, resulting in improved performance on various benchmarks.

DiPhon: Diffusion on Graphons for Scalable Graph Generation

This paper introduces DiPhon, a diffusion framework for size-scalable graph generation, using a continuous diffusion process on the graphon space and a discretized graph-level process.

Native Video-Action Pretraining for Generalizable Robot Control

This paper introduces LingBot-VA 2.0, a video-action foundation model for embodied robot control that features semantic visual-action tokenization, causal pretraining, a sparse MoE backbone, and enhanced asynchronous inference.

PS4: Proxy-Supervised Joint Training for Real Target Speaker Extraction

The paper introduces PS4, a framework for training target speaker extraction models using a large-scale corpus and a proxy-supervised joint training strategy.

Scalable Visual Pretraining for Language Intelligence

This paper shows that visual pretraining benefits foundation model intelligence, outperforming text-only pretraining across multiple backbones and benchmarks.

SAGA: Schema-Aware Grounding for Agentic Text-to-SPARQL Generation

The paper proposes SAGA, a framework for schema-aware grounding in agentic text-to-SPARQL generation, which maintains a persistent type state, filters incompatible property candidates, and handles missing schema information permissively.

Scaling Unmodified Multithreaded Applications with Elastic CXL-based Distributed Shared Memory

xDSM is a full-space, elastic DSM system built over CXL that transparently scales unmodified multithreaded applications through an OS-runtime co-design, a dynamic data placement policy, and spatial locality-aware elasticity.

Climate-resilient electric vehicle charging infrastructure for sustainable cities: An interpretable causal-ensemble framework for preventive maintenance and low-carbon mobility

This paper develops FGDSE, a feature-governed dynamic stacking ensemble for climate-resilient management of electric vehicle charging assets, which predicts daily fault risk over a multi-week horizon and identifies extreme heat as a causal factor for heat-sensitive posts.

Specula: Scaling formal specifications for autonomous model checking of system code

Specula is an autonomous system that uses LLMs to generate high-quality formal specifications for large, complex code, aiding both code understanding and bug finding.

Papers

cs.SE · cs.AI · cs.DC · NEW · Empirical · Jul 28, 2026

Specula: Scaling formal specifications for autonomous model checking of system code

Qian Cheng, Saad Mohammad Rafid Pial, Ruize Tang, Yiming Su +5 more

Specula is an autonomous system that uses LLMs to generate high-quality formal specifications for large, complex code, aiding both code understanding and bug finding.

math.OC · cs.LG