Xuan Zhang

13 indexed papers

Top categories

AI ×12, Crypto ×5, NLP ×2, Vision ×2, Info Retrieval ×1, ML ×1

Frequent co-authors

Yuxuan Zhang (4×)

Wenxuan Zhang (2×)

Mingxuan Zhang (2×)

Jiahui Han (2×)

Dadi Guo (2×)

Songze Li (2×)

Research Timeline

2026

A Security Analysis of the OpenClaw AI Agent Framework

This paper analyzes 470 security advisories in the OpenClaw AI agent framework, demonstrating that the system's structural weakness lies in per-layer trust enforcement, enabling cross-layer remote code execution (RCE) attacks.

TraceScope: Interactive URL Triage via Decoupled Checklist Adjudication

TraceScope is an interactive, sandboxed triage pipeline that analyzes complex phishing URLs by simulating human interaction and verifying suspicious behavior against a detailed checklist, achieving high detection rates even against advanced, evasive threats.

A Comparative Evaluation of AI Agent Security Guardrails

This paper comparatively evaluates DKnownAI Guard against three competitors, demonstrating that DKnownAI Guard achieves superior performance in detecting both agent-specific threats and harmful content.

MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection

MIRA proposes a novel source-aware filtering framework that discovers and anchors evaluation rubrics during data selection, significantly improving code-oriented mid-training data quality while reducing token usage.

iLoRA: Bayesian Low-Rank Adaptation with Latent Interaction Graphs for Microbiome Diagnosis

iLoRA introduces a novel Bayesian graph-conditioned LoRA framework that jointly learns prediction and latent interaction structure, significantly improving microbiome diagnosis by modeling microbe-microbe cross-talk.

Diagnosing Harmful Continuation in Answer-Correct Long-CoT Training Traces

The paper shows that post-conclusion continuation in answer-correct long-CoT traces is harmful during LLM fine-tuning and proposes a method to cut this continuation.

PrivacyPeek: Auditing What LLM-Based Agents Acquire, Not Just What They Say

The paper introduces PrivacyPeek, a new benchmark that audits the acquisition stage of LLM-based agents to show that unnecessary and sensitive data acquisition is a widespread and critical privacy vulnerability.

Dr. DocBench: A Comprehensive Benchmark for Expert-Level and Difficult Document Parsing

The paper introduces Dr. DocBench, a difficulty-aware, comprehensive benchmark designed to rigorously test expert-level and challenging document parsing capabilities for VLMs, demonstrating that current state-of-the-art models fail on complex, domain-specific structures.

MindClaw: Closed-Loop Embodied Mental-State Reasoning for Precision Intervention

The paper introduces MindClaw, a closed-loop framework that enables embodied agents to perform real-time mental-state reasoning and intervene with precision, significantly outperforming standard VLM baselines.

Collaborative Space Object Detection with Multi-Satellite Viewpoints in LEO Constellations

This paper demonstrates that fusing multi-viewpoint data from multiple satellites significantly enhances the accuracy of space object detection in congested LEO constellations, establishing multi-view fusion as an effective strategy.

QUBRIC: Co-Designing Queries and Rubrics for RL Beyond Verifiable Rewards

QUBRIC introduces a co-design framework that simultaneously optimizes queries and rubrics, overcoming the bottleneck of vague rubrics derived from open-ended questions, leading to significant gains in RL performance.

CompRank: Efficient LLM Reranking via Token-Level Compression and Decoding-Free Scoring

This paper proposes CompRank, a token-efficient reranking framework for large language models that reduces redundant computation and achieves strong reranking performance.

Papers

cs.IR | Empirical | Recent | Jun 10, 2026

CompRank: Efficient LLM Reranking via Token-Level Compression and Decoding-Free Scoring

Xuan Lu, Haohang Huang, Yingqi Fan, Junlong Tong +4 more

This paper proposes CompRank, a token-efficient reranking framework for large language models that reduces redundant computation and achieves strong reranking performance.

cs.CL | cs.AI | Recent | Jun 2, 2026