Built by Teycir Ben Soltane
ArXivCSExplorer

Qi Liu

9 indexed papers

Recent (6 mo): 9
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 9

Top categories

AI ×5, NLP ×2, Crypto ×2, Robotics ×1, ML ×1, Multiagent ×1

Frequent co-authors

Jiaqi Liu (2×)
Dong Jing (1×)
Jingchen Nie (1×)
Tianqi Zhang (1×)
Huaxiu Yao (1×)
Zhiwu Lu (1×)

Research Timeline

2026
Ciphertext-Policy ABE for $\mathsf{NC}^1$ Circuits with Constant-Size Ciphertexts from Succinct LWE

The paper presents a lattice-based Ciphertext-Policy Attribute-Based Encryption (CP-ABE) scheme that supports $\mathsf{NC}^1$ access policies while maintaining constant-size ciphertexts.

Breaking the Secret: Economic Interventions for Combating Collusion in Embodied Multi-Agent Systems

The paper proposes a mutagenic incentive intervention approach that mitigates collusion in embodied multi-agent systems by reshaping agents' payoff structures, effectively inducing defection and maintaining system efficiency.

EgoBench: An Interactive Egocentric Multimodal Benchmark for Tool-Using Agents

The paper introduces EgoBench, the first interactive multimodal benchmark designed to jointly evaluate advanced AI agents' capabilities in visual perception, multi-hop reasoning, and dynamic tool usage in real-world, egocentric scenarios.

Entropy-KL Divergence-based Token Masking: A Novel Approach for Selective Fine-tuning of Large Language Models

The paper proposes EKSFT, a selective fine-tuning method that masks high-entropy or high-KL divergence tokens during Supervised Fine-Tuning (SFT) to prevent distribution shift and improve subsequent Reinforcement Learning (RL) performance.

Configurable Reward Model for Balanced Safety Alignment

The paper introduces the Configurable Safety Reward Model (CSRM), a novel reward model that can be jointly optimized for calibrated safety compliance and reward modeling, significantly improving LLM safety alignment across diverse and unseen safety configurations.

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement

The paper introduces SAVE, a framework that uses on-policy feedback and the value function to self-supervise and improve reward models, significantly enhancing RLHF performance across multiple benchmarks.

Repurposing Adversarial Perturbations for Continual Learning: From Defense to Active Alignment

The paper introduces AdvCL, a framework that repurposes adversarial perturbations as a geometric control signal to stabilize continual learning in large language models, significantly reducing forgetting and enhancing robustness.

AutoMedBench: Towards Medical AutoResearch with Agentic AI Models

The paper introduces AutoMedBench, a novel workflow-aware benchmark that evaluates autonomous medical-AI agents across a five-stage research process, revealing that agents struggle most with validation and submission.

TempoVLA: Learning Speed-Controllable Vision-Language-Action Policies

TempoVLA is a novel Vision-Language-Action model that enables controllable execution speed for robot manipulation by explicitly conditioning the policy on the desired speed.


Papers

cs.RO • cs.AI • Recent • Jun 4, 2026

TempoVLA: Learning Speed-Controllable Vision-Language-Action Policies

Dong Jing, Jingchen Nie, Tianqi Zhang, Jiaqi Liu +3 more

cs.LG • cs.AI • Recent • Jun 1, 2026

Repurposing Adversarial Perturbations for Continual Learning: From Defense to Active Alignment

Ran Liu, Min Yu, Mingqi Liu, Jianguo Jiang +6 more

cs.AI • Recent • Jun 1, 2026

AutoMedBench: Towards Medical AutoResearch with Agentic AI Models

Junqi Liu, Salena Song, Yuhan Wang, Jiawei Mao +11 more

cs.CL • Recent • May 29, 2026

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement

Xiaobo Wang, Tong Wu, Min Tang, Jiaqi Li +2 more

cs.AI • Recent • May 28, 2026

Entropy-KL Divergence-based Token Masking: A Novel Approach for Selective Fine-tuning of Large Language Models

Qi Liu, Mingdi Sun, Yongyi He, Zhi Zheng +4 more

cs.CL • Recent • May 28, 2026

Configurable Reward Model for Balanced Safety Alignment

Zhengping Jiang, Mehran Khodabandeh, Akash Bharadwaj, Manik Bhandari +4 more

cs.AI • Recent • May 27, 2026

EgoBench: An Interactive Egocentric Multimodal Benchmark for Tool-Using Agents

Yunqi Liu, Tong Niu, Zitong Wang, Zhenlong Dai +3 more

cs.CR • cs.MA • Recent • Apr 26, 2026

Breaking the Secret: Economic Interventions for Combating Collusion in Embodied Multi-Agent Systems

Qi Liu, Xiaohui Chen, Zhihui Zhao, Yaowen Zheng +4 more

cs.CR • Recent • Mar 17, 2026

Ciphertext-Policy ABE for $\mathsf{NC}^1$ Circuits with Constant-Size Ciphertexts from Succinct LWE

Jiaqi Liu, Yuanyi Zhang, Fang-Wei Fu
