Qi Li

46 indexed papers

Top categories

Crypto ×21 · AI ×17 · Vision ×9 · ML ×6 · NLP ×6 · Robotics ×4 · Sound ×3 · Info Retrieval ×3

Frequent co-authors

Xiaoqi Li 5×

Jiaqi Li 4×

Ke Xu 4×

Wenkai Li 4×

Zongwei Li 4×

Jiaqi Liu 3×

Research Timeline

2026

A Unified and Reproducible Experimentation Framework for Speech Understanding

The paper introduces SURE, a unified framework designed to standardize and improve the comparability and reproducibility of evaluations for advanced speech understanding models.

I-WebGenBench : Evaluating Interactivity in LLM-Generated Scientific Web Applications

The paper introduces I-WebGenBench, a framework and benchmark that converts static scientific papers into executable, interactive web systems, allowing users to dynamically explore the paper's mechanisms.

Sandboxed Coding Agents are Competitive Omni-modal Task Solvers

The paper demonstrates that specialized coding agents, using only text and image access within a sandbox, can effectively solve complex omnimodal tasks, often outperforming state-of-the-art native omnimodal models.

Repurposing Adversarial Perturbations for Continual Learning: From Defense to Active Alignment

The paper introduces AdvCL, a framework that repurposes adversarial perturbations as a geometric control signal to stabilize continual learning in large language models, significantly reducing forgetting and enhancing robustness.

Why Not Hyperparameter-Friendly Optimisation? A Monotonic Adaptive Norm Rescaling Approach For Long-Tailed Recognition

The paper proposes Self-Adaptive Monotonic Normalization (SAMN), a hyperparameter-friendly method that improves long-tailed recognition by enforcing monotonicity on per-class weight norms without requiring parameter regularization.

AutoMedBench: Towards Medical AutoResearch with Agentic AI Models

The paper introduces AutoMedBench, a novel workflow-aware benchmark that evaluates autonomous medical-AI agents across a five-stage research process, revealing that agents struggle most with validation and submission.

TempoVLA: Learning Speed-Controllable Vision-Language-Action Policies

TempoVLA is a novel Vision-Language-Action model that enables controllable execution speed for robot manipulation by explicitly conditioning the policy on the desired speed.

RedEdit: Agentic Red-Teaming of Image Safety Classifiers via MCTS-Guided Photo-Editing

The paper introduces RedEdit, an agentic red-teaming framework showing that malicious images can be easily edited to bypass safety classifiers while retaining their harmful semantics.

Looped World Models

The paper introduces Looped World Models, a looped architecture for world modelling that iteratively refines latent environment states, with up to 100x greater parameter efficiency.

Toward Calibrated Mixture-of-Experts Under Distribution Shift

This paper studies the behavior of mixture-of-experts (MoE) models under distribution shift and proposes an adversarial reweighting method to improve their calibration.

Learning Action Priors for Cross-embodiment Robot Manipulation

This paper proposes a two-stage training framework to pretrain action modules with motion priors before Vision-Language-Action (VLA) alignment, improving VLA performance and reducing optimization challenges.

From Bootstrapping to Sequence Modeling: A Unified Generative Framework for Personalized Landing-Page Modeling

This paper proposes GLAN, a sequence modeling framework for Personalized Landing Page Modeling on online platforms, addressing the limitations of previous reinforcement learning approaches.

ShopX: A Foundation Model for Intent-to-Item Fulfillment in Agentic Shopping

This paper proposes ShopX, a model-centric framework for intent-driven shopping experiences using a single foundation model for intent understanding, execution planning, and item-space operations.

SABLE: An NDA-Safe Closed-Loop LLM Framework for Analog Circuit Optimization in Industrial EDA Flows

The paper presents SABLE, an NDA-safe framework that allows large language models to optimize analog circuits in industrial EDA tools while protecting proprietary information.

Scaling Mixture-of-Experts Video Pretraining for Embodied Intelligence

This paper introduces LingBot-Video, a video pretraining paradigm for embodied intelligence using a DiT-based approach, Mixture-of-Experts framework, and extensive robot-oriented data.

Native Video-Action Pretraining for Generalizable Robot Control

This paper introduces LingBot-VA 2.0, a video-action foundation model designed for embodiment, with semantic visual-action tokenization, causal pretraining, sparse MoE backbone, and enhanced asynchronous inference.

BadWAM: When World-Action Models Dream Right but Act Wrong

This paper introduces BadWAM, a framework for modeling and evaluating World-Action Drift Attacks, a new class of adversarial attacks that break the alignment between a World-Action Model's (WAM's) imagined future and its executed actions.

Toward Site-Aware MR Art Exhibitions: A SLAM-Based Deployment Pipeline for Spatial Coherence and Exhibition Experience

This paper presents a practical pipeline for designing and deploying large-scale Mixed Reality art exhibitions using SLAM-based alignment, and evaluates its impact on technical stability and user experience.

SimulS2ST-Omni: Data-Efficient Streaming Speech-to-Speech Translation via Explicit Trajectory Supervision

This paper introduces a training recipe for sentence-level and long-form streaming speech-to-speech translation using only 2k hours of paired cross-lingual data and auxiliary supervision.

Application-Driven Architecture Exploration for Cross-Layer Heterogeneous Systems

The paper presents CHASE, an application-driven framework that explores physically feasible Cross-layer Heterogeneous System architectures for executing workloads with diverse requirements.

Papers

cs.DC · Empirical · Recent · Jul 25, 2026

Application-Driven Architecture Exploration for Cross-Layer Heterogeneous Systems

Yuchen Fan, Minghong Sun, Jikui Ma, Yunpeng Xu +18 more

The paper presents CHASE, an application-driven framework that explores physically feasible Cross-layer Heterogeneous System architectures for executing workloads with diverse requirements.

cs.SD · Empirical · Recent