Xin Zhang

18 indexed papers

Top categories

AI ×12, Crypto ×4, NLP ×3, Vision ×3, ML ×3, Robotics ×2, Software Eng. ×1, HCI ×1

Frequent co-authors

Hongxin Zhang (2×)

Chunru Lin (2×)

Chuang Gan (2×)

Junyan Li (1×)

Zhou Xian (1×)

Tsun-Hsuan Wang (1×)

Research Timeline

2026

ClawGuard: A Runtime Security Framework for Tool-Augmented LLM Agents Against Indirect Prompt Injection

ClawGuard is a novel runtime security framework that deterministically enforces user-confirmed rules at tool-call boundaries to protect LLM agents from indirect prompt injection.

ZK-Value: A Practical Zero-Knowledge System for Verifiable Data Valuation

ZK-Value introduces a practical, scalable zero-knowledge system for calculating data valuations (Shapley values) in data marketplaces, significantly reducing proving time while maintaining high accuracy.

SafeHarbor: Hierarchical Memory-Augmented Guardrail for LLM Agent Safety

SafeHarbor is a novel, hierarchical memory-augmented framework that establishes context-aware decision boundaries for LLM agents, achieving state-of-the-art safety while minimizing over-refusal.

Demystifying Data Organization for Enhanced LLM Training

This paper proposes four guidelines and two novel data ordering methods (STR and SAW) to systematically optimize data organization, significantly enhancing the stability and performance of LLM training.

RoboWits: Unexpected Challenges for Robotic Creative Problem Solving

The paper introduces RoboWits, a new bi-manual robotic benchmark designed to test a robot's cognitive reasoning and adaptability to unexpected challenges, revealing that current Vision-Language-Action (VLA) models are brittle when faced with mutated or constrained tasks.

Semantic and Visual Evidence for Efficient Long-Video Reasoning: A Solution for the HD-EPIC VQA Challenge

The paper proposes a unified framework that decouples long-video reasoning into semantic and visual evidence, significantly improving performance on the HD-EPIC VQA Challenge.

Elfs, transducers and quantum walks

This paper introduces Electric Flow Sampling (elfs) as a zero-error quantum walk primitive and uses it to derive improved quantum algorithms for various graph problems, including semi-supervised learning.

ERGeoBench: A Comprehensive Benchmark for Embodied Reasoning and Geo-localization in Multimodal Large Language Models

The paper introduces ERGeoBench, a comprehensive diagnostic benchmark designed to evaluate the fine-grained capabilities of multimodal large language models (MLLMs) for embodied geo-localization across various viewing conditions.

Anchoring LLM Gender Bias to Human Baselines: A Cross-Lingual Audit

The paper audits six LLMs across four languages, finding that their gender stereotyping is significantly wider than human baselines and that cross-lingual translation fundamentally alters the nature of the bias.

Boosting Multimodal Federated Learning via Chained Modality Optimization

The paper proposes FedMChain, a novel federated learning framework that structures multimodal training into sequential phases to mitigate modality competition and improve model performance while reducing communication overhead.

RUBAS: Rubric-Based Reinforcement Learning for Agent Safety

The paper introduces RUBAS, a rubric-based reinforcement learning framework that improves agent safety by providing fine-grained, multi-dimensional rewards for complex tool-use scenarios.

Harvesting AI Computation at the Edge via Generic Approximation

The paper proposes a framework to harvest unused computation resources on AI chips for general-purpose tasks using neural architecture search and approximation techniques.

TRIAGE: Role-Typed Credit Assignment for Agentic Reinforcement Learning

This paper proposes TRIAGE, a role-typed credit assignment framework for agentic reinforcement learning to address the structural incompleteness of standard GRPO.

MADB: A Large-Scale Music Aesthetics Dataset with Professional and Multi-Dimensional Annotations

The paper introduces MADB, a large-scale dataset and benchmark for music aesthetic assessment with 9,999 tracks annotated by 30 trained annotators across 10 perceptual dimensions.

An improved upper bound for the planar Turán number of $C_8$

This paper proves that every simple planar graph on $n$ vertices with no copy of $C_8$ has at most $\frac{69}{25}(n-2)$ edges, improving the previous bound.

PeakFlow: Peak-Guided Coarse-to-Refined Modeling for EEG-Based Dynamic Affective Trajectory Prediction

This paper proposes PeakFlow, a framework for dynamic affective trajectory prediction in EEG data using a peak-guided coarse-to-refined approach.

Beyond Fail-to-Pass: Iterative Hardening of Co-Generated Bug Reproduction Tests and Fixes

This paper proposes CoHarden, a co-generation framework for automated program repair that uses a lax signal as an in-loop convergence criterion to prevent lax regressions.

GS-Agent: Creating 4D Physical Worlds With Generative Simulation

This paper introduces GS-Agent, an end-to-end multi-agent framework that generates realistic, dynamic, and controllable 4D physical worlds from natural language descriptions by emulating the human creation process with physics engines.

Papers

cs.RO · cs.AI · cs.CL · Empirical · Recent · Jul 23, 2026

GS-Agent: Creating 4D Physical Worlds With Generative Simulation

Hongxin Zhang, Chunru Lin, Junyan Li, Zhou Xian +2 more

cs.SE · cs.AI · Empirical