Jie Zhang

30 indexed papers

Top categories

AI ×16 · Crypto ×14 · ML ×8 · Vision ×7 · NLP ×5 · Robotics ×3 · Signal Processing ×2 · Neural Computing ×1

Frequent co-authors

Mengjie Zhang (2×)

Wenhao Li (2×)

Xueying Jiang (2×)

Quanhao Qian (2×)

Deli Zhao (2×)

Ran Xu (2×)

Research Timeline

2026

Watermarking Should Be Treated as a Monitoring Primitive

The paper argues that watermarking must be viewed as a monitoring primitive, introducing an observer-based threat model that shows even zero-bit watermarking can enable entity-level attribution through signal aggregation.

The Cases LJP Never Sees: Prosecution Decision Prediction for More Complete Criminal Liability Assessment

The paper introduces Prosecution Decision Prediction (PDP), a new legal AI task that assesses prosecutorial review decisions, showing that current state-of-the-art LLMs perform significantly worse on this task than on standard judgment prediction.

Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments

Qwen-VLA introduces a unified embodied foundation model that extends vision-language understanding to continuous action generation, enabling robust, multi-task generalization across diverse robotic tasks and embodiments.

LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards

LongTraceRL addresses long-context reasoning challenges by generating highly challenging training data and introducing a fine-grained rubric reward, significantly improving evidence-grounded reasoning in LLMs.

FlowTime: Towards Continuous Generative Watch Time Prediction via Flow-based Personalized Priors

FlowTime proposes a novel Continuous Generative Regression framework using a Flow-based Personalized Prior to accurately model the multimodal and heterogeneous nature of user watch time prediction, significantly outperforming existing state-of-the-art methods.

Deft Scheduling of Dynamic Cloud Workflows with Varying Deadlines via Mixture-of-Experts

The paper introduces DEFT, a novel Mixture-of-Experts DRL architecture, to intelligently schedule dynamic cloud workflows with varying deadlines, significantly improving performance over existing single-path schedulers.

Thinking Economically: A Hierarchical Framework for Adaptive-Complexity Reasoning in LLMs

The paper introduces Hierarchical Adaptive Budgeter (HAB), a framework that improves LLM reasoning efficiency by adaptively allocating computational resources to match the intrinsic complexity of both problems and individual reasoning steps.

STaR-KV: Spatio-Temporal Adaptive Re-weighting for KV Cache Compression in GUI Vision-Language Models

STaR-KV introduces a novel, training-free KV cache compression framework that adaptively re-weights token importance across spatial, temporal, and distributional axes, significantly reducing GPU memory usage for GUI vision-language models while maintaining high accuracy.

What If Prompt Injection Never Left? Exploring Cross-Session Stored Prompt Injection in Agentic Systems

The paper introduces and analyzes cross-session stored prompt injection, demonstrating that persistent system state transforms prompt injection from a temporary model-level threat into a long-lived, system-level vulnerability in agentic systems.

AgentCyberRange: Benchmarking Frontier AI Systems in Realistic Cyber Ranges

The paper introduces AgentCyberRange, an open, multi-range infrastructure for measuring autonomous cyber attack capability in realistic cyber ranges, and evaluates six frontier AI systems.

Access Selection for Finite-SNR Modal Recoverability in Sampled-Wave Receivers

This paper formulates receiver selection for large-aperture wave receivers as a modal-recoverability problem and proposes a framework with three nested recoverability criteria for individual modal degrees, joint target subspace, and target modes.

SkillOpt-Lite: Better and Faster Agent Self-evolution via One Line of Vibe

This paper proposes SkillOpt-Lite, a minimal viable pipeline for skill optimization in autonomous agents, which accelerates convergence and outperforms full SkillOpt.

From Fixed to Free Cameras: Calibration-Free View-Robust Vision-Language-Action Model

This paper introduces Camera-Centric VLA, a Vision-Language-Action policy model that predicts camera-centric actions and a hand-eye matrix, allowing the policy to infer camera geometry on its own.

Infinite Worlds with Versatile Interactions

The paper introduces LingBot-World 2.0, an advanced version of a language model with unbounded interaction horizon, rapid response time, diverse interactive elements, and agentic harness integration.

TRM-Raft: A Byzantine-Resistant Raft Consensus via Integrated Trust and Reputation Model

This paper proposes TRM-Raft, a Byzantine-resistant enhancement for Raft consensus that integrates a blockchain-based Trust and Reputation Model to prevent election forgery and log tampering.

Move First, Commit Later: Selective LiDAR-to-BIM Global Initialization via Sequential Consensus with Symmetry-Aware Abstention

The paper introduces Move First, Commit Later, a selective layer for global LiDAR-to-BIM initialization that decides whether to commit to a registration based on evidence from multiple submaps, reducing confident aliasing.

Massive MIMO-OFDM ISAC for Sparse ISAR Imaging: Joint Power and Subcarrier Allocation

This paper proposes a framework for integrated sensing and communication (ISAC) using mMIMO OFDM and ISAR imaging, developing an adaptive ADMM algorithm for high-resolution image recovery and a joint resource-allocation method.

End-to-End Markov State Sequence Learning for Auditory Attention Decoding

This paper proposes an end-to-end Markov framework for auditory attention decoding using conditional random fields and an EEG-speech correlation backbone.

Search Hardness-Aware LLM-Based Problem Formulation for Expensive Simulation-Driven Design

This paper proposes SHA-PF, a search hardness-aware LLM-based problem formulation framework for expensive simulation-driven design, which prioritizes rare samples with greater progress potential and requires significantly fewer evaluations to reach design requirements.

3D-Aware VLMs with Implicit and Explicit Geometries

The paper introduces VLM-IE3D, a framework that enhances 2D vision-language models with implicit and explicit 3D geometries learned from RGB videos, achieving superior performance on various 3D tasks.

Papers

cs.NE · Empirical · Recent · Jul 23, 2026

Search Hardness-Aware LLM-Based Problem Formulation for Expensive Simulation-Driven Design

Yuchen Li, Handing Wang, Bing Xue, Mengjie Zhang

cs.CV · cs.AI · cs.LG · Empirical