Built by Teycir Ben Soltane
ArXivCSExplorer

Xin Li

29 indexed papers

Recent (6 mo): 29
With code: 0
Influential cites: 0
Benchmarked: 0

[Chart: Publications per year; labels shown: 29, 26]

Top categories

AI ×22, Crypto ×13, NLP ×9, ML ×5, Vision ×4, Info Retrieval ×2, Software Eng. ×2, Theoretical Economics ×1

Frequent co-authors

Yuexin Li (3×)
Yulin Chen (3×)
Yufei He (3×)
Tri Cao (3×)
Bryan Hooi (3×)
Ziming Li (2×)

Research Timeline

2026
Plant, Persist, Trigger: Sleeper Attack on Large Language Model Agents

This paper introduces the concept of 'Sleeper Attack,' demonstrating that adversarial content can persist across multiple interactions with an LLM agent, posing a more subtle and difficult-to-detect safety threat than single-interaction attacks.

MACReD: A Multi-Agent Collaborative Reasoning Framework for Reaction Diagram Parsing

MACReD introduces a hierarchical multi-agent framework that achieves state-of-the-art performance in parsing complex chemical reaction diagrams by coordinating specialized agents for perception and global reasoning.

Cycle-Space Informed Detection of Autoencoded Blind False Data Injection Attacks on Power Systems

The paper proposes a Cycle-Space Detector (CSD) that uses network topology constraints to effectively detect stealthy, data-driven False Data Injection Attacks (FDIA) that exploit the null space of measurement data.

Reinforcement Learning with Robust Rubric Rewards

The paper introduces $\text{RLR}^3$, a novel framework that extends verifiable rewards in Reinforcement Learning to handle partially verifiable, multi-criteria vision-language tasks by integrating robust rubric scoring.

CRITIC-R1: Learning Structured Critics for Retrieval-Augmented Generation

CRITIC-R1 introduces a structured critic framework that treats RAG critique as an explicit error diagnosis problem using reinforcement learning, significantly improving answer quality over strong RAG baselines.

Quantifying and Optimizing Simplicity via Polynomial Representations

The paper introduces polynomial representations as a quantitative, distribution-aware metric for measuring model simplicity, demonstrating that the effective degree of this representation is a superior predictor of generalization compared to existing proxies.

AliMark: Enhancing Robustness of Sentence-Level Watermarking Against Text Paraphrasing

AliMark proposes a novel watermarking framework that treats sentence-level watermarking as a bit sequence alignment problem, significantly enhancing robustness against structural text perturbations like sentence splitting and merging.

AliMark: Enhancing Robustness of Sentence-Level Watermarking Against Text Paraphrasing

AliMark proposes a novel framework that enhances the robustness of sentence-level watermarking by reformulating the problem as a bit sequence encoding and alignment task, significantly improving resilience against structural text perturbations like sentence splitting and merging.

Dive into Waves: Morlet Spectral Transformer for Cross-Subject Emotion Decoding from EEG

The paper proposes the Morlet Spectral Transformer (MST), a novel architecture that effectively decodes cross-subject emotion from EEG by designing specialized spectral and spatial representations, outperforming existing large foundation models.

Moment-Video: Diagnosing Temporal Fidelity of Video MLLMs on Momentary Visual Events

The paper introduces Moment-Video, a new benchmark that diagnoses the ability of video MLLMs to understand brief, critical visual events, revealing that current models struggle significantly with temporal fidelity.

InfoMerge: Information-aware Token Compression for Efficient Video Large Language Models

InfoMerge is a novel, training-free method that significantly compresses visual tokens for Video-LLMs by estimating temporal redundancy and allocating tokens based on content richness, achieving high efficiency with minimal performance loss.

SPADE-Bench: Evaluating Spontaneous Strategic Deception in Agents via Plan-Action Divergence

The paper introduces SPADE-Bench, a new benchmark designed to rigorously evaluate 'agent deception'—the divergence between an agent's reported plan and its actual executed actions—which is a critical safety issue for autonomous LLM agents.

CAPF: Guiding Search-Agent Rollouts with Credit-Attenuated Privileged Feedback

The paper proposes Credit-Attenuated Privileged Feedback (CAPF), a training-time mechanism that uses verifier-side information to guide LLM search agents, significantly improving their performance on complex QA tasks.

Revisiting Ripple Effects in Knowledge Editing through Pressure-Aware Joint Neighborhood Optimization

The paper proposes Joint Neighborhood Optimization (JNO), a novel knowledge-editing framework that jointly addresses the coupled pressures of desirable knowledge propagation and unintended knowledge leakage during single-edit updates in LLMs.

Joint Agent Memory and Exploration Learning via Novelty Signals

The JAMEL framework addresses the challenge of effective exploration in open-ended environments by jointly training agent memory and exploration policies using natural, novelty-driven signals.

Privacy-preserving Information Sharing in Oligopoly Competitions

The paper analyzes information-sharing mechanisms in oligopolies, finding that privacy protection alone is insufficient to incentivize suppliers to share data; successful sharing requires combining privacy safeguards with a sufficiently informative external signal.

QUBRIC: Co-Designing Queries and Rubrics for RL Beyond Verifiable Rewards

QUBRIC introduces a co-design framework that simultaneously optimizes queries and rubrics, overcoming the bottleneck of vague rubrics derived from open-ended questions, leading to significant gains in RL performance.

MLEvolve: A Self-Evolving Framework for Automated Machine Learning Algorithm Discovery

MLEvolve is a novel self-evolving multi-agent framework that enables LLM agents to discover and optimize machine learning algorithms for complex, long-horizon tasks.

OneReason Technical Report

The paper proposes OneReason, a framework that enhances the reasoning capability of generative recommendation models by focusing on improving item perception and structuring user behavior into coherent latent interests.

CORE-Bench: A Comprehensive Benchmark for Code Retrieval in the Era of Agentic Coding

This paper introduces CORE-Bench, a comprehensive benchmark for code retrieval in agentic coding.


Papers

cs.IR | Empirical | Recent | Jun 10, 2026

CORE-Bench: A Comprehensive Benchmark for Code Retrieval in the Era of Agentic Coding

Fuwei Zhang, Yanzhao Zhang, Mingxin Li, Dingkun Long +4 more

This paper introduces CORE-Bench, a comprehensive benchmark for code retrieval in agentic coding.

cs.AI | cs.CL | Recent | Jun 4, 2026

MLEvolve: A Self-Evolving Framework for Automated Machine Learning Algorithm Discovery

Shangheng Du, Xiangchao Yan, Jinxin Shi, Zongsheng Cao +10 more

MLEvolve is a novel self-evolving multi-agent framework that enables LLM agents to discover and optimize machine learning algorithms for complex, long-horizon tasks.

cs.IR | cs.AI | cs.CL | Recent | Jun 4, 2026

OneReason Technical Report

OneRec Team, Biao Yang, Boyang Ding, Chenglong Chu +80 more

The paper proposes OneReason, a framework that enhances the reasoning capability of generative recommendation models by focusing on improving item perception and structuring user behavior into coheren…

cs.CL | cs.AI | Recent | Jun 2, 2026

QUBRIC: Co-Designing Queries and Rubrics for RL Beyond Verifiable Rewards

Rongzhi Zhang, Rui Feng, Zhihan Zhang, Jingfeng Yang +7 more

QUBRIC introduces a co-design framework that simultaneously optimizes queries and rubrics, overcoming the bottleneck of vague rubrics derived from open-ended questions, leading to significant gains in…

cs.CV | cs.AI | Recent | Jun 1, 2026

Moment-Video: Diagnosing Temporal Fidelity of Video MLLMs on Momentary Visual Events

Xiaolin Liu, Yilun Zhu, Xiangyu Zhao, Xuehui Wang +8 more

The paper introduces Moment-Video, a new benchmark that diagnoses the ability of video MLLMs to understand brief, critical visual events, revealing that current models struggle significantly with temp…

cs.CV | cs.CL | Recent | Jun 1, 2026

InfoMerge: Information-aware Token Compression for Efficient Video Large Language Models

Xinxin Liu, Shiwei Gan, Xiao Liu, Yafeng Yin +2 more

InfoMerge is a novel, training-free method that significantly compresses visual tokens for Video-LLMs by estimating temporal redundancy and allocating tokens based on content richness, achieving high…

cs.CL | cs.AI | Recent | Jun 1, 2026

SPADE-Bench: Evaluating Spontaneous Strategic Deception in Agents via Plan-Action Divergence

Yuyan Bu, Haowei Li, Qirui Zheng, Bowen Dong +6 more

The paper introduces SPADE-Bench, a new benchmark designed to rigorously evaluate 'agent deception'—the divergence between an agent's reported plan and its actual executed actions—which is a critical…

cs.AI | Recent | Jun 1, 2026

CAPF: Guiding Search-Agent Rollouts with Credit-Attenuated Privileged Feedback

Bin Chen, Xinye Liao, Yiming Liu, Xin Liao +1 more

The paper proposes Credit-Attenuated Privileged Feedback (CAPF), a training-time mechanism that uses verifier-side information to guide LLM search agents, significantly improving their performance on…

cs.AI | Recent | Jun 1, 2026

Revisiting Ripple Effects in Knowledge Editing through Pressure-Aware Joint Neighborhood Optimization

Haoben Huang, Shuxin Liu, Ou Wu, Di Gao

The paper proposes Joint Neighborhood Optimization (JNO), a novel knowledge-editing framework that jointly addresses the coupled pressures of desirable knowledge propagation and unintended knowledge l…

cs.AI | Recent | Jun 1, 2026

Joint Agent Memory and Exploration Learning via Novelty Signals

Shizuo Tian, Xiaohong Weng, Rui Kong, Yuxuan Chen +8 more

The JAMEL framework addresses the challenge of effective exploration in open-ended environments by jointly training agent memory and exploration policies using natural, novelty-driven signals.

econ.TH | cs.CR | cs.CY | Recent | Jun 1, 2026

Privacy-preserving Information Sharing in Oligopoly Competitions

Yuxin Liu, M. Amin Rahimian

The paper analyzes information-sharing mechanisms in oligopolies, finding that privacy protection alone is insufficient to incentivize suppliers to share data; successful sharing requires combining pr…

cs.LG | cs.AI | Recent | May 30, 2026

Dive into Waves: Morlet Spectral Transformer for Cross-Subject Emotion Decoding from EEG

Jiaxin Qing, Lexin Li

The paper proposes the Morlet Spectral Transformer (MST), a novel architecture that effectively decodes cross-subject emotion from EEG by designing specialized spectral and spatial representations, ou…

cs.CV | cs.AI | Recent | May 28, 2026

Reinforcement Learning with Robust Rubric Rewards

Ya-Qi Yu, Hao Wang, Fangyu Hong, Xiangyang Qu +14 more

The paper introduces $\text{RLR}^3$, a novel framework that extends verifiable rewards in Reinforcement Learning to handle partially verifiable, multi-criteria vision-language tasks by integrating robu…

cs.CL | cs.AI | Recent | May 28, 2026

CRITIC-R1: Learning Structured Critics for Retrieval-Augmented Generation

Wenhan Xiao, Ziwei Zhang, Chuanyue Yu, Xingcheng Fu +3 more

CRITIC-R1 introduces a structured critic framework that treats RAG critique as an explicit error diagnosis problem using reinforcement learning, significantly improving answer quality over strong RAG…

cs.AI | Recent | May 28, 2026

Quantifying and Optimizing Simplicity via Polynomial Representations

Tianren Zhang, Xiangxin Li, Minghao Xiao, Guanyu Chen +1 more

The paper introduces polynomial representations as a quantitative, distribution-aware metric for measuring model simplicity, demonstrating that the effective degree of this representation is a superio…

cs.CR | cs.AI | cs.CL | Recent | May 28, 2026

AliMark: Enhancing Robustness of Sentence-Level Watermarking Against Text Paraphrasing

Yuexin Li, Wenjie Qu, Linyu Wu, Yulin Chen +4 more

AliMark proposes a novel watermarking framework that treats sentence-level watermarking as a bit sequence alignment problem, significantly enhancing robustness against structural text perturbations li…

cs.CR | cs.AI | cs.CL | Recent | May 28, 2026

AliMark: Enhancing Robustness of Sentence-Level Watermarking Against Text Paraphrasing

Yuexin Li, Wenjie Qu, Linyu Wu, Yulin Chen +4 more

AliMark proposes a novel framework that enhances the robustness of sentence-level watermarking by reformulating the problem as a bit sequence encoding and alignment task, significantly improving resil…

cs.AI | Recent | May 27, 2026

Plant, Persist, Trigger: Sleeper Attack on Large Language Model Agents

Yongxiang Li, Moxin Li, Zhixin Ma, Fengbin Zhu +3 more

This paper introduces the concept of 'Sleeper Attack,' demonstrating that adversarial content can persist across multiple interactions with an LLM agent, posing a more subtle and difficult-to-detect s…

cs.AI | Recent | May 27, 2026

MACReD: A Multi-Agent Collaborative Reasoning Framework for Reaction Diagram Parsing

Chuang Tang, Chenhao Lin, Yin Xu, Hao Wang +4 more

MACReD introduces a hierarchical multi-agent framework that achieves state-of-the-art performance in parsing complex chemical reaction diagrams by coordinating specialized agents for perception and gl…

cs.LG | cs.CR | Recent | May 27, 2026

Cycle-Space Informed Detection of Autoencoded Blind False Data Injection Attacks on Power Systems

Xin Li, Chenhan Xiao, Jonathan Cohen, Aviad Elyashar +2 more

The paper proposes a Cycle-Space Detector (CSD) that uses network topology constraints to effectively detect stealthy, data-driven False Data Injection Attacks (FDIA) that exploit the null space of me…
