Peng Wei

9 indexed papers

Top categories

AI ×8, Crypto ×6, Robotics ×2, ML ×2, Society ×2, Multiagent ×1, Systems and Control ×1, Info Retrieval ×1

Frequent co-authors

Zhipeng Wei 5×

Qi Zhang 2×

Huanli Gong 2×

Yue Dong 2×

N. Benjamin Erichson 2×

Wesley Shu 2×

Research Timeline

2026

Preserving Decision Sovereignty in Military AI: A Trade-Secret-Safe Architectural Framework for Model Replaceability, Human Authority, and State Control

The paper proposes the Energetic Paradigm, a model-agnostic architectural framework that allows states to maintain decision sovereignty and control over military AI systems, even when using proprietary, commercially sourced analytical models.

A Public Theory of Distillation Resistance via Constraint-Coupled Reasoning Architectures

The paper proposes a theoretical framework, called constraint-coupled reasoning, to make AI models less susceptible to knowledge distillation by coupling high-level capabilities to internal stability constraints.

Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses

This survey provides a comprehensive, structured review of safety research in Embodied AI, analyzing attacks and defenses across the entire embodied pipeline to guide the development of safe, robust, and reliable real-world agents.

MT-JailBench: A Modular Benchmark for Understanding Multi-Turn Jailbreak Attacks

The paper introduces MT-JailBench, a modular framework for evaluating multi-turn jailbreaks, demonstrating that controlling experimental components like prompt generation and resource budgets is crucial for fair comparison and understanding attack success.

DMN: A Compositional Framework for Jailbreaking Multimodal LLMs with Multi-Image Inputs

The paper proposes DMN, a compositional jailbreak framework that utilizes distributed instructions, multimodal evidence, and a number chain task across multiple images to significantly enhance the attack success rate against multimodal LLMs.

Relevant Is Not Warranted: Evidence-Force Calibration for Cited RAG

The paper introduces FORCEBENCH, a new stress test designed to evaluate whether cited sources genuinely warrant the strength of a claim, revealing that standard citation evaluation methods often fail to detect over-strong claims.

D-Judge: Disrupting Multi-Turn Jailbreaks using Semantics-Preserving Output Rewriting

D-Judge introduces a semantics-preserving output rewriting defense that disrupts multi-turn jailbreak attacks by misaligning the feedback signal used by an attacker's judge model.

OneReason Technical Report

The paper proposes OneReason, a framework that enhances the reasoning capability of generative recommendation models by focusing on improving item perception and structuring user behavior into coherent latent interests.

Runtime Safety Filtering for Learned Small UAS Separation Policies under GNSS Degradation

This paper evaluates two approaches for maintaining safe separation between small Unmanned Aircraft Systems (sUAS) in urban environments with degraded Global Navigation Satellite System (GNSS) signals: filtering the policy's actions or its observations. The results show that observation filtering significantly reduces mid-air collisions and remains robust to the tradeoff between separation distance and closing rate.

Papers

cs.RO · cs.LG · cs.MA · Empirical · Recent · Jul 10, 2026

Runtime Safety Filtering for Learned Small UAS Separation Policies under GNSS Degradation

Alex Zongo, Peng Wei

cs.IR · cs.AI · cs.CL · Recent