ArXivCSExplorer

Yang Li

50 indexed papers

Recent (6 mo): 50
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

[Bar chart: publications per year. Surviving labels: 50, 26]

Top categories

AI ×32, Crypto ×26, NLP ×9, ML ×8, Vision ×8, Software Eng. ×7, Robotics ×3, Info Retrieval ×2

Frequent co-authors

Yang Liu ×18
Jing Chen ×3
Yilong Yang ×3
Zhuo Ma ×3
Yebo Feng ×3
Cong Wu ×3

Research Timeline

2026
BAGEN: Are LLM Agents Budget-Aware?

This paper introduces the concept of Budget-Aware Agents (BAGEN), showing that current LLM agents often fail to manage resources proactively, and proposes that incorporating early stop and interval estimation significantly improves efficiency.

CAST: Non-Privileged Clipped Asymmetric Self-Teaching with Advantage Flipping for GRPO

The paper proposes CAST, an answer-free self-distillation method that enhances Group Relative Policy Optimization (GRPO) for verifiable rewards, allowing token-level advantage signals even when all sampled trajectories are uniformly correct or incorrect.

SpecDB: LLM-Generated Customized Databases via Feature-Oriented Decomposition

SpecDB is a novel system that uses LLMs to synthesize highly customized, purpose-built relational databases, achieving performance comparable to commercial systems while significantly reducing code size.

PaCo-VLA: Passivity-Shielded Compliance Prior for Contact-Rich Vision-Language-Action Manipulation

PaCo-VLA introduces a passivity-shielded compliance prior to safely bridge the gap between high-level Vision-Language-Action (VLA) semantic outputs and low-level, force-sensitive robotic control.

From Empathy to Personalized Empathy: Adapting Empathetic Strategies to Individual Users

This paper introduces personalized empathy, a capability for LLMs to adapt empathetic strategies based on individual user history, and proposes PereGRM, a reward modeling framework that significantly enhances this personalized empathy.

LaSR: Context-Aware Speech Recognition via Latent Reasoning

The paper proposes LaSR, a context-aware training paradigm that uses latent reasoning to significantly improve speech recognition, especially for specialized terminology, without adding latency.

Bridging Requirements and Architecture: Multi-Agent Orchestration with External Knowledge and Hierarchical Memory

The paper introduces MAAD, a multi-agent framework that autonomously transforms software requirements into comprehensive, multi-view architectural blueprints, significantly improving completeness and reducing manual validation.

SIRIUS-SQL: Anchoring Multi-Candidate Text-to-SQL in Execution Feedback

SIRIUS-SQL introduces a robust multi-candidate text-to-SQL system that addresses weaknesses in candidate generation, error handling, and selection, achieving state-of-the-art performance on complex benchmarks.

Distilling Neuro-Symbolic Programs into 3D Multi-modal LLMs

The paper introduces APEIRIA, a neuro-symbolic 3D Multi-modal LLM that bridges the gap between interpretable symbolic reasoning and flexible, open-vocabulary 3D understanding.

Unlocking the Black Box of Latent Reasoning: An Interpretability-Guided Approach to Intervention

This paper introduces interpretability-guided, training-free interventions that systematically improve the accuracy and controllability of latent reasoning in LLMs by leveraging structural and causal insights into continuous hidden states.

SIRI: Self-Internalizing Reinforcement Learning with Intrinsic Skills for LLM Agent Training

SIRI introduces a self-internalizing reinforcement learning framework that allows LLM agents to autonomously discover and integrate reusable skills directly into their core policy, significantly improving performance on complex tasks without external skill generators.

RASER: Recoverability-Aware Selective Escalation Router for Multi-Hop Question Answering

RASER introduces a family of cheap, router-based systems that selectively decide whether to perform expensive multi-hop retrieval, significantly reducing LLM token costs while maintaining state-of-the-art performance.

Training-Free Composed Video Retrieval via Visual Representation-Guided Video-LLM Reasoning

The paper proposes a training-free framework, Visual Representation-Guided Video-LLM Reasoning, to perform composed video retrieval by using visual examples and text instructions, achieving strong performance on the CVPR 2026 challenge.

Community-Aware Assessment of Social Textual Engagement and Resonance: A Human-Centric Perspective on User-Generated Content Evaluation

The paper introduces CASTER, a new human-centric task for evaluating User-Generated Content (UGC) resonance, and proposes MEDEA, an architecture that uses a Social Chain-of-Thought mechanism to simulate community reactions for quality assessment.

Do Gender Cues Affect LLM Value Trade-offs? Evidence from a Controlled Decision Benchmark

The paper demonstrates that explicit gender cues systematically affect LLM value trade-offs, causing decision flips that are often masked or misattributed by the models themselves.

Outsmarting the Chameleon: Counterfactual Decoupling for Tactical OOD Shifts in Live Streaming Risk Assessment

The paper proposes a novel framework, LPCD, that uses latent causal modeling to robustly assess evolving adversarial risks in live streaming by decoupling malicious intent from superficial tactical shifts.

Benign Inputs, Harmful Outputs: Cross-Modal Jailbreaking via Distributed Semantic Recomposition

The paper introduces Distributed Semantic Recomposition (DSR), a novel cross-modal jailbreaking framework that bypasses existing safety filters by decomposing harmful intent into benign input components, achieving high attack success rates with low input toxicity.

NeuroArmor: Safe-Variant-Guided Representation Consistency for Selective Re-Anchoring in Jailbreak Defense

NeuroArmor is a white-box runtime defense that uses prompt-specific safe variants to selectively detect and mitigate jailbreak attacks, significantly reducing attack success rates while maintaining a low false positive rate.

HORIZON: Recoverability-Governed Curriculum for Physical-Domain Scaling

This paper studies how to scale robust robot policies by expanding physical domains in a recoverable way.

Regret Minimization with Adaptive Opponents in Repeated Games

This paper introduces Repeated Policy Regret (RP-Regret), a novel game-theoretic metric for analyzing regret in repeated games with adaptive opponents, and proposes algorithms to minimize it.

Papers

cs.LG cs.AI cs.GT · Recent · Jun 4, 2026

Regret Minimization with Adaptive Opponents in Repeated Games

Mingyang Liu, Asuman Ozdaglar, Tiancheng Yu, Kaiqing Zhang

This paper introduces Repeated Policy Regret (RP-Regret), a novel game-theoretic metric for analyzing regret in repeated games with adaptive opponents, and proposes algorithms to minimize it.

cs.RO · Recent · Jun 3, 2026

HORIZON: Recoverability-Governed Curriculum for Physical-Domain Scaling

Chenhao Bai, Liqin Lu, Kaijun Wang, Hui Chen +4 more

This paper studies how to scale robust robot policies by expanding physical domains in a recoverable way.

cs.CR cs.AI · Recent · Jun 2, 2026

NeuroArmor: Safe-Variant-Guided Representation Consistency for Selective Re-Anchoring in Jailbreak Defense

Zhongyang Lin, Ziran Zhao, Feifei Zhai, Pengyuan Liu

NeuroArmor is a white-box runtime defense that uses prompt-specific safe variants to selectively detect and mitigate jailbreak attacks, significantly reducing attack success rates while maintaining a…

cs.AI cs.LG · Recent · Jun 1, 2026

SIRI: Self-Internalizing Reinforcement Learning with Intrinsic Skills for LLM Agent Training

Zhongyu He, Yuanfan Li, Fei Huang, Tianyu Chen +8 more

SIRI introduces a self-internalizing reinforcement learning framework that allows LLM agents to autonomously discover and integrate reusable skills directly into their core policy, significantly impro…

cs.AI · Recent · Jun 1, 2026

RASER: Recoverability-Aware Selective Escalation Router for Multi-Hop Question Answering

Yuyang Li, Zihe Yan, Tobias Käfer

RASER introduces a family of cheap, router-based systems that selectively decide whether to perform expensive multi-hop retrieval, significantly reducing LLM token costs while maintaining state-of-the…

cs.CV · Recent · Jun 1, 2026

Training-Free Composed Video Retrieval via Visual Representation-Guided Video-LLM Reasoning

Yang Liu, Qianqian Xu, Peisong Wen, Siran Dai +1 more

The paper proposes a training-free framework, Visual Representation-Guided Video-LLM Reasoning, to perform composed video retrieval by using visual examples and text instructions, achieving strong per…

cs.AI · Recent · Jun 1, 2026

Community-Aware Assessment of Social Textual Engagement and Resonance: A Human-Centric Perspective on User-Generated Content Evaluation

Tianjiao Li, Kai Zhao, Xiang Li, Yang Liu +1 more

The paper introduces CASTER, a new human-centric task for evaluating User-Generated Content (UGC) resonance, and proposes MEDEA, an architecture that uses a Social Chain-of-Thought mechanism to simula…

cs.CL · Recent · Jun 1, 2026

Do Gender Cues Affect LLM Value Trade-offs? Evidence from a Controlled Decision Benchmark

Yangyang Liu, Dong Yu, Pengyuan Liu

The paper demonstrates that explicit gender cues systematically affect LLM value trade-offs, causing decision flips that are often masked or misattributed by the models themselves.

cs.LG cs.CR · Recent · Jun 1, 2026

Outsmarting the Chameleon: Counterfactual Decoupling for Tactical OOD Shifts in Live Streaming Risk Assessment

Yiran Qiao, Jing Chen, Jiaqi Xu, Yang Liu +2 more

The paper proposes a novel framework, LPCD, that uses latent causal modeling to robustly assess evolving adversarial risks in live streaming by decoupling malicious intent from superficial tactical sh…

cs.CR · Recent · Jun 1, 2026

Benign Inputs, Harmful Outputs: Cross-Modal Jailbreaking via Distributed Semantic Recomposition

Yani Wang, Yilong Yang, Yang Liu, Zhuzhu Wang +2 more

The paper introduces Distributed Semantic Recomposition (DSR), a novel cross-modal jailbreaking framework that bypasses existing safety filters by decomposing harmful intent into benign input componen…

cs.SE cs.AI · Recent · May 31, 2026

Bridging Requirements and Architecture: Multi-Agent Orchestration with External Knowledge and Hierarchical Memory

Ruiyin Li, Yiran Zhang, Xiyu Zhou, Yangxiao Cai +5 more

The paper introduces MAAD, a multi-agent framework that autonomously transforms software requirements into comprehensive, multi-view architectural blueprints, significantly improving completeness and…

cs.AI · Recent · May 31, 2026

SIRIUS-SQL: Anchoring Multi-Candidate Text-to-SQL in Execution Feedback

Leo Luo, Haining Xie, Siqi Shen, Zhipeng Ma +7 more

SIRIUS-SQL introduces a robust multi-candidate text-to-SQL system that addresses weaknesses in candidate generation, error handling, and selection, achieving state-of-the-art performance on complex be…

cs.CV cs.AI cs.CL · Recent · May 31, 2026

Distilling Neuro-Symbolic Programs into 3D Multi-modal LLMs

Wentao Mo, Yang Liu

The paper introduces APEIRIA, a neuro-symbolic 3D Multi-modal LLM that bridges the gap between interpretable symbolic reasoning and flexible, open-vocabulary 3D understanding.

cs.CL cs.LG · Recent · May 31, 2026

Unlocking the Black Box of Latent Reasoning: An Interpretability-Guided Approach to Intervention

Shuochen Chang, Tong Bai, Xiaofeng Zhang, Qianli Ma +4 more

This paper introduces interpretability-guided, training-free interventions that systematically improve the accuracy and controllability of latent reasoning in LLMs by leveraging structural and causal…

cs.RO cs.AI eess.SY · Recent · May 30, 2026

PaCo-VLA: Passivity-Shielded Compliance Prior for Contact-Rich Vision-Language-Action Manipulation

Haofan Cao, Zhaoyang Li, Zhichao You, Liang Guo +1 more

PaCo-VLA introduces a passivity-shielded compliance prior to safely bridge the gap between high-level Vision-Language-Action (VLA) semantic outputs and low-level, force-sensitive robotic control.

cs.CL · Recent · May 30, 2026

From Empathy to Personalized Empathy: Adapting Empathetic Strategies to Individual Users

Wuqiang Zheng, Chengbing Wang, Yilin Yang, Junyi Cheng +5 more

This paper introduces personalized empathy, a capability for LLMs to adapt empathetic strategies based on individual user history, and proposes PereGRM, a reward modeling framework that significantly…

cs.CL · Recent · May 30, 2026

LaSR: Context-Aware Speech Recognition via Latent Reasoning

Heyang Liu, Ziyang Cheng, Jiayi Huang, Wenyang Xiao +4 more

The paper proposes LaSR, a context-aware training paradigm that uses latent reasoning to significantly improve speech recognition, especially for specialized terminology, without adding latency.

cs.LG cs.AI cs.CL · Recent · May 29, 2026

BAGEN: Are LLM Agents Budget-Aware?

Yuxiang Lin, Zihan Wang, Mengyang Liu, Yuxuan Shan +8 more

This paper introduces the concept of Budget-Aware Agents (BAGEN), showing that current LLM agents often fail to manage resources proactively, and proposes that incorporating early stop and interval es…

cs.AI · Recent · May 29, 2026

CAST: Non-Privileged Clipped Asymmetric Self-Teaching with Advantage Flipping for GRPO

Yang Li, Gongle Xue, Yijia Guo, Yuheng Yuan +2 more

The paper proposes CAST, an answer-free self-distillation method that enhances Group Relative Policy Optimization (GRPO) for verifiable rewards, allowing token-level advantage signals even when all sa…

cs.DB cs.AI · Recent · May 29, 2026

SpecDB: LLM-Generated Customized Databases via Feature-Oriented Decomposition

Yunkai Lou, Longbin Lai, Shunyang Li, Zhengping Qian +1 more

SpecDB is a novel system that uses LLMs to synthesize highly customized, purpose-built relational databases, achieving performance comparable to commercial systems while significantly reducing code si…
