ArXivCSExplorer

Peng Wei

8 indexed papers

Recent (6 mo): 8
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 8

Top categories

AI ×8, Crypto ×6, Society ×2, Info Retrieval ×1, NLP ×1, Vision ×1, Robotics ×1, ML ×1

Frequent co-authors

Zhipeng Wei (5×)
Qi Zhang (2×)
Huanli Gong (2×)
Yue Dong (2×)
N. Benjamin Erichson (2×)
Wesley Shu (2×)

Research Timeline

2026
Preserving Decision Sovereignty in Military AI: A Trade-Secret-Safe Architectural Framework for Model Replaceability, Human Authority, and State Control

The paper proposes the Energetic Paradigm, a model-agnostic architectural framework that allows states to maintain decision sovereignty and control over military AI systems, even when using proprietary, commercially sourced analytical models.

A Public Theory of Distillation Resistance via Constraint-Coupled Reasoning Architectures

The paper proposes a theoretical framework, called constraint-coupled reasoning, to make AI models less susceptible to knowledge distillation by coupling high-level capabilities to internal stability constraints.

Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses

This survey provides a comprehensive, structured review of safety research in Embodied AI, analyzing attacks and defenses across the entire embodied pipeline to guide the development of safe, robust, and reliable real-world agents.

MT-JailBench: A Modular Benchmark for Understanding Multi-Turn Jailbreak Attacks

The paper introduces MT-JailBench, a modular framework for evaluating multi-turn jailbreaks, demonstrating that controlling experimental components like prompt generation and resource budgets is crucial for fair comparison and understanding attack success.

DMN: A Compositional Framework for Jailbreaking Multimodal LLMs with Multi-Image Inputs

The paper proposes DMN, a compositional jailbreak framework that utilizes distributed instructions, multimodal evidence, and a number chain task across multiple images to significantly enhance the attack success rate against multimodal LLMs.

Relevant Is Not Warranted: Evidence-Force Calibration for Cited RAG

The paper introduces FORCEBENCH, a new stress test designed to evaluate whether cited sources genuinely warrant the strength of a claim, revealing that standard citation evaluation methods often fail to detect over-strong claims.

D-Judge: Disrupting Multi-Turn Jailbreaks using Semantics-Preserving Output Rewriting

D-Judge introduces a semantics-preserving output rewriting defense that disrupts multi-turn jailbreak attacks by misaligning the feedback signal used by an attacker's judge model.

OneReason Technical Report

The paper proposes OneReason, a framework that enhances the reasoning capability of generative recommendation models by focusing on improving item perception and structuring user behavior into coherent latent interests.


Papers

cs.IR, cs.AI, cs.CL · Recent · Jun 4, 2026

OneReason Technical Report

OneRec Team, Biao Yang, Boyang Ding, Chenglong Chu +80 more

The paper proposes OneReason, a framework that enhances the reasoning capability of generative recommendation models by focusing on improving item perception and structuring user behavior into coheren…

cs.CR, cs.AI · Recent · May 31, 2026

D-Judge: Disrupting Multi-Turn Jailbreaks using Semantics-Preserving Output Rewriting

Huanli Gong, Zhipeng Wei, Yu Fu, Haz Sameen Shahgir +3 more

D-Judge introduces a semantics-preserving output rewriting defense that disrupts multi-turn jailbreak attacks by misaligning the feedback signal used by an attacker's judge model.

cs.AI · Recent · May 27, 2026

Relevant Is Not Warranted: Evidence-Force Calibration for Cited RAG

Pin Qian, Su Wang, Xiaoyuan Wang, Yihang Chen +6 more

The paper introduces FORCEBENCH, a new stress test designed to evaluate whether cited sources genuinely warrant the strength of a claim, revealing that standard citation evaluation methods often fail…

cs.CR, cs.AI · Recent · May 18, 2026

DMN: A Compositional Framework for Jailbreaking Multimodal LLMs with Multi-Image Inputs

Wenzhuo Xu, Zhipeng Wei, Zonghao Ying, Deyue Zhang +3 more

The paper proposes DMN, a compositional jailbreak framework that utilizes distributed instructions, multimodal evidence, and a number chain task across multiple images to significantly enhance the att…

cs.CR, cs.AI · Recent · May 10, 2026

MT-JailBench: A Modular Benchmark for Understanding Multi-Turn Jailbreak Attacks

Xinkai Zhang, Zhipeng Wei, Huanli Gong, Jing Ting Zheng +3 more

The paper introduces MT-JailBench, a modular framework for evaluating multi-turn jailbreaks, demonstrating that controlling experimental components like prompt generation and resource budgets is cruci…

cs.CR, cs.AI, cs.CV · Recent · Mar 28, 2026

Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses

Xiao Li, Xiang Zheng, Yifeng Gao, Xinyu Xia +34 more

This survey provides a comprehensive, structured review of safety research in Embodied AI, analyzing attacks and defenses across the entire embodied pipeline to guide the development of safe, robust,…

cs.CY, cs.AI, cs.CR · Recent · Mar 26, 2026

Preserving Decision Sovereignty in Military AI: A Trade-Secret-Safe Architectural Framework for Model Replaceability, Human Authority, and State Control

Peng Wei, Wesley Shu

The paper proposes the Energetic Paradigm, a model-agnostic architectural framework that allows states to maintain decision sovereignty and control over military AI systems, even when using proprietar…

cs.AI, cs.CR, cs.CY · Recent · Mar 26, 2026

A Public Theory of Distillation Resistance via Constraint-Coupled Reasoning Architectures

Peng Wei, Wesley Shu

The paper proposes a theoretical framework, called constraint-coupled reasoning, to make AI models less susceptible to knowledge distillation by coupling high-level capabilities to internal stability…
