Built by Teycir Ben Soltane
ArXivCSExplorer

Wei Lu

8 indexed papers

Recent (6 mo): 8
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 8

Top categories

NLP ×7, Crypto ×5, AI ×3, ML ×2, Info Retrieval ×2

Frequent co-authors

Wenhang Shi (2×)
Jinhao Dong (2×)
Yiren Chen (2×)
Zhe Zhao (2×)
Shuqing Bian (2×)
Xiaoyong Du (2×)

Research Timeline

2026
Geometry-Aware Localized Watermarking for Copyright Protection in Embedding-as-a-Service

The paper proposes GeoMark, a geometry-aware localized watermarking framework that robustly protects Embedding-as-a-Service (EaaS) against model stealing and copyright infringement while preserving utility.

Same Evidence, Different Answers: Canonical-Context On-Policy Distillation for Multi-Turn Language Models

The paper introduces Canonical-Context On-Policy Distillation (CCOPD) to improve multi-turn language model performance by mitigating 'self-anchored drift,' ensuring consistent answers regardless of whether the evidence is presented in a single prompt or gradually across multiple turns.

Implicit Identity Technologies for LLMs: Fingerprinting and Watermarking across Datasets, Models, and Generated Content

This paper introduces 'implicit identity' as a unifying framework to survey and categorize LLM fingerprinting and watermarking techniques for verifying ownership and provenance across datasets, models, and generated content.

DiscourseFlip: An Oblique Discourse-Level Opinion Manipulation Attack against Black-box Retrieval-Augmented Generation

The paper introduces DiscourseFlip, a novel graph-guided attack that demonstrates how coordinated poisoning across a multi-topic query space can manipulate the overall opinion generated by black-box Retrieval-Augmented Generation (RAG) systems.

DiscourseFlip: An Oblique Discourse-Level Opinion Manipulation Attack against Black-box Retrieval-Augmented Generation

The paper introduces DiscourseFlip, a novel black-box, graph-guided attack that manipulates opinions across an entire multi-topic query network, demonstrating a significant leap in scope and effectiveness over existing RAG attack methods.

Scaling Agentic Capabilities via Grounded Interaction Synthesis

The paper introduces Grounded Agentic Interaction Synthesis (GAIS), a framework that generates high-quality, diverse, and complex agentic training data by anchoring tasks to real-world protocols, significantly improving base model performance.

Training Prompt Matters: State-Adaptive Optimization for Robust Fine-Tuning

The paper introduces State-Adaptive Prompt Optimization (SAPO), a novel training strategy that treats prompts as dynamic variables to achieve robust fine-tuning, significantly mitigating catastrophic forgetting and improving generalization in LLMs.

Sequential Data Poisoning in LLM Post-Training

The paper introduces the threat model of sequential data poisoning, demonstrating that multiple, collaborating attackers can exploit compound vulnerabilities in LLM post-training pipelines that are invisible when analyzing individual stages.

Papers

cs.LG, cs.CR · Recent · Jun 3, 2026

Sequential Data Poisoning in LLM Post-Training

Jack Sanderson, Yihan Wang, Xiaoqian Lu, Gautam Kamath +1 more

The paper introduces the threat model of sequential data poisoning, demonstrating that multiple, collaborating attackers can exploit compound vulnerabilities in LLM post-training pipelines that are in…

cs.CL · Recent · Jun 1, 2026

Scaling Agentic Capabilities via Grounded Interaction Synthesis

Wenhang Shi, Jinhao Dong, Yiren Chen, Zhe Zhao +3 more

The paper introduces Grounded Agentic Interaction Synthesis (GAIS), a framework that generates high-quality, diverse, and complex agentic training data by anchoring tasks to real-world protocols, sign…

cs.CL · Recent · Jun 1, 2026

Training Prompt Matters: State-Adaptive Optimization for Robust Fine-Tuning

Wenhang Shi, Yiren Chen, Shuqing Bian, Zhe Zhao +4 more

The paper introduces State-Adaptive Prompt Optimization (SAPO), a novel training strategy that treats prompts as dynamic variables to achieve robust fine-tuning, significantly mitigating catastrophic…

cs.CL, cs.AI, cs.CR · Recent · May 31, 2026

DiscourseFlip: An Oblique Discourse-Level Opinion Manipulation Attack against Black-box Retrieval-Augmented Generation

Yuyang Gong, Miaokun Chen, Jiawei Liu, Zhuo Chen +4 more

The paper introduces DiscourseFlip, a novel graph-guided attack that demonstrates how coordinated poisoning across a multi-topic query space can manipulate the overall opinion generated by black-box R…

cs.CL, cs.AI, cs.CR · Recent · May 31, 2026

DiscourseFlip: An Oblique Discourse-Level Opinion Manipulation Attack against Black-box Retrieval-Augmented Generation

Yuyang Gong, Miaokun Chen, Jiawei Liu, Zhuo Chen +4 more

The paper introduces DiscourseFlip, a novel black-box, graph-guided attack that manipulates opinions across an entire multi-topic query network, demonstrating a significant leap in scope and effective…

cs.CL, cs.AI · Recent · May 28, 2026

Same Evidence, Different Answers: Canonical-Context On-Policy Distillation for Multi-Turn Language Models

Zizhuo Lin, Quanling Liu, Jinsheng Quan, Chao Zhang +5 more

The paper introduces Canonical-Context On-Policy Distillation (CCOPD) to improve multi-turn language model performance by mitigating 'self-anchored drift,' ensuring consistent answers regardless of wh…

cs.CR, cs.CL, cs.LG · Recent · May 28, 2026

Implicit Identity Technologies for LLMs: Fingerprinting and Watermarking across Datasets, Models, and Generated Content

Bing Liu, Shunping Wang, Yufan Zhu, Xinyi Yu +4 more

This paper introduces 'implicit identity' as a unifying framework to survey and categorize LLM fingerprinting and watermarking techniques for verifying ownership and provenance across datasets, models…

cs.CR, cs.CL · Recent · Apr 13, 2026

Geometry-Aware Localized Watermarking for Copyright Protection in Embedding-as-a-Service

Zhimin Chen, Xiaojie Liang, Wenbo Xu, Yuxuan Liu +1 more

The paper proposes GeoMark, a geometry-aware localized watermarking framework that robustly protects Embedding-as-a-Service (EaaS) against model stealing and copyright infringement while preserving ut…
