Yang Li

50 indexed papers

Recent (6 mo)

With code

Influential cites

Benchmarked

Publications per year

Top categories

AI×32Crypto×26NLP×9ML×8Vision×8Software Eng.×7Robotics×3Info Retrieval×2

Frequent co-authors

Yang Liu18×

Jing Chen3×

Yilong Yang3×

Zhuo Ma3×

Yebo Feng3×

Cong Wu3×

Research Timeline

2026

BAGEN: Are LLM Agents Budget-Aware?

This paper introduces the concept of Budget-Aware Agents (BAGEN), showing that current LLM agents often fail to manage resources proactively, and proposes that incorporating early stop and interval estimation significantly improves efficiency.

CAST: Non-Privileged Clipped Asymmetric Self-Teaching with Advantage Flipping for GRPO

The paper proposes CAST, an answer-free self-distillation method that enhances Group Relative Policy Optimization (GRPO) for verifiable rewards, allowing token-level advantage signals even when all sampled trajectories are uniformly correct or incorrect.

SpecDB: LLM-Generated Customized Databases via Feature-Oriented Decomposition

SpecDB is a novel system that uses LLMs to synthesize highly customized, purpose-built relational databases, achieving performance comparable to commercial systems while significantly reducing code size.

PaCo-VLA: Passivity-Shielded Compliance Prior for Contact-Rich Vision-Language-Action Manipulation

PaCo-VLA introduces a passivity-shielded compliance prior to safely bridge the gap between high-level Vision-Language-Action (VLA) semantic outputs and low-level, force-sensitive robotic control.

From Empathy to Personalized Empathy: Adapting Empathetic Strategies to Individual Users

This paper introduces personalized empathy, a capability for LLMs to adapt empathetic strategies based on individual user history, and proposes PereGRM, a reward modeling framework that significantly enhances this personalized empathy.

LaSR: Context-Aware Speech Recognition via Latent Reasoning

The paper proposes LaSR, a context-aware training paradigm that uses latent reasoning to significantly improve speech recognition, especially for specialized terminology, without adding latency.

Bridging Requirements and Architecture: Multi-Agent Orchestration with External Knowledge and Hierarchical Memory

The paper introduces MAAD, a multi-agent framework that autonomously transforms software requirements into comprehensive, multi-view architectural blueprints, significantly improving completeness and reducing manual validation.

SIRIUS-SQL: Anchoring Multi-Candidate Text-to-SQL in Execution Feedback

SIRIUS-SQL introduces a robust multi-candidate text-to-SQL system that addresses weaknesses in candidate generation, error handling, and selection, achieving state-of-the-art performance on complex benchmarks.

Distilling Neuro-Symbolic Programs into 3D Multi-modal LLMs

The paper introduces APEIRIA, a neuro-symbolic 3D Multi-modal LLM that bridges the gap between interpretable symbolic reasoning and flexible, open-vocabulary 3D understanding.

Unlocking the Black Box of Latent Reasoning: An Interpretability-Guided Approach to Intervention

This paper introduces interpretability-guided, training-free interventions that systematically improve the accuracy and controllability of latent reasoning in LLMs by leveraging structural and causal insights into continuous hidden states.

SIRI: Self-Internalizing Reinforcement Learning with Intrinsic Skills for LLM Agent Training

SIRI introduces a self-internalizing reinforcement learning framework that allows LLM agents to autonomously discover and integrate reusable skills directly into their core policy, significantly improving performance on complex tasks without external skill generators.

RASER: Recoverability-Aware Selective Escalation Router for Multi-Hop Question Answering

RASER introduces a family of cheap, router-based systems that selectively decide whether to perform expensive multi-hop retrieval, significantly reducing LLM token costs while maintaining state-of-the-art performance.

Training-Free Composed Video Retrieval via Visual Representation-Guided Video-LLM Reasoning

The paper proposes a training-free framework, Visual Representation-Guided Video-LLM Reasoning, to perform composed video retrieval by using visual examples and text instructions, achieving strong performance on the CVPR 2026 challenge.

Community-Aware Assessment of Social Textual Engagement and Resonance: A Human-Centric Perspective on User-Generated Content Evaluation

The paper introduces CASTER, a new human-centric task for evaluating User-Generated Content (UGC) resonance, and proposes MEDEA, an architecture that uses a Social Chain-of-Thought mechanism to simulate community reactions for quality assessment.

Do Gender Cues Affect LLM Value Trade-offs? Evidence from a Controlled Decision Benchmark

The paper demonstrates that explicit gender cues systematically affect LLM value trade-offs, causing decision flips that are often masked or misattributed by the models themselves.

Outsmarting the Chameleon: Counterfactual Decoupling for Tactical OOD Shifts in Live Streaming Risk Assessment

The paper proposes a novel framework, LPCD, that uses latent causal modeling to robustly assess evolving adversarial risks in live streaming by decoupling malicious intent from superficial tactical shifts.

Benign Inputs, Harmful Outputs: Cross-Modal Jailbreaking via Distributed Semantic Recomposition

The paper introduces Distributed Semantic Recomposition (DSR), a novel cross-modal jailbreaking framework that bypasses existing safety filters by decomposing harmful intent into benign input components, achieving high attack success rates with low input toxicity.

NeuroArmor: Safe-Variant-Guided Representation Consistency for Selective Re-Anchoring in Jailbreak Defense

NeuroArmor is a white-box runtime defense that uses prompt-specific safe variants to selectively detect and mitigate jailbreak attacks, significantly reducing attack success rates while maintaining a low false positive rate.

HORIZON: Recoverability-Governed Curriculum for Physical-Domain Scaling

This paper studies how to scale robust robot policies by expanding physical domains in a recoverable way.

Regret Minimization with Adaptive Opponents in Repeated Games

This paper introduces Repeated Policy Regret (RP-Regret), a novel game-theoretic metric for analyzing regret in repeated games with adaptive opponents, and proposes algorithms to minimize it.

Highlighted terms show continued research focus across papers

Papers

cs.LGcs.AIcs.GTRecentJun 4, 2026

Regret Minimization with Adaptive Opponents in Repeated Games

Mingyang Liu, Asuman Ozdaglar, Tiancheng Yu, Kaiqing Zhang

This paper introduces Repeated Policy Regret (RP-Regret), a novel game-theoretic metric for analyzing regret in repeated games with adaptive opponents, and proposes algorithms to minimize it.

View →

cs.RORecentJun 3, 2026