Bo Zhang

19 indexed papers

Top categories

AI ×12, Crypto ×11, NLP ×6, ML ×4, Vision ×3, Robotics ×1, Prog. Lang. ×1, Biomolecules ×1

Frequent co-authors

Chao Shen (4×)

Liang He (3×)

Lei Bai (3×)

Jing Yang (3×)

Xia Hu (3×)

Jing Shao (3×)

Research Timeline

2026

When Safe Models Merge into Danger: Exploiting Latent Vulnerabilities in LLM Fusion

The paper introduces TrojanMerge, a framework demonstrating that model merging can be exploited to systematically compromise the safety alignment of multiple individually safe LLMs.

AttnDiff: Attention-based Differential Fingerprinting for Large Language Models

AttnDiff introduces a data-efficient white-box framework that extracts intrinsic attention-based fingerprints to verify the provenance and detect unauthorized derivation of large language models (LLMs) despite common model laundering techniques.

Constraining Host-Level Abuse in Self-Hosted Computer-Use Agents via TEE-Backed Isolation

The paper proposes an operation-centric, TEE-backed isolation model to constrain self-hosted computer-use agents, preventing malicious or unsafe host-level operations without sacrificing general functionality.

MemMark: State-Evolution Attribution Watermarking for Agent Long-Term Memory Systems

MemMark introduces a state-evolution attribution watermark that embeds owner-controlled signals into latent memory-write decisions, enabling robust provenance tracking for agent memory even when all traditional logs and metadata are lost.

Controllable Lung Nodule Synthesis via Histogram-Regularized Latent Diffusion Models

The paper introduces a histogram-regularized latent diffusion model to synthesize highly realistic and subtype-specific pulmonary nodules in 3D CT volumes, addressing the limitations of existing methods that fail to capture accurate lesion-level intensity distributions.

AgentSchool: An LLM-Powered Multi-Agent Simulation for Education

The paper introduces AgentSchool, an advanced LLM-powered multi-agent simulator that models learning as state transitions to provide a robust, ethically viable testbed for educational research and pedagogical reform.

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

The paper introduces AgentDoG 1.5, a lightweight and scalable alignment framework that significantly improves AI agent safety and security for complex, open-world agentic scenarios.

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

The paper introduces AgentDoG 1.5, a lightweight and scalable alignment framework that significantly improves AI agent safety and security for complex open-world agent deployments.

DataShield: Safety-degrading Data Filtering for LLM Benign Instruction Fine-Tuning

DataShield proposes an efficient method to identify safety-degrading samples within benign datasets, quantifying each sample's contribution to an LLM's compliance behavior.

AMix-2: Establishing Protein as a Native Modality in Large Language Models

The paper introduces AMix-2, a novel protein-text foundation model that unifies protein understanding and sequence design by embedding both modalities in a shared token space, achieving state-of-the-art performance on comprehensive benchmarks.

Towards Efficient LLMs Annealing with Principled Sample Selection

The paper proposes DiReCT, a novel framework that treats data selection during LLM annealing as a constrained optimization problem based on the spectral geometry of the loss landscape, achieving state-of-the-art performance.

DataShield: Safety-degrading Data Filtering for LLM Benign Instruction Fine-Tuning

DataShield proposes an efficient method to identify safety-degrading samples within benign datasets, preventing the degradation of LLM safety capabilities during fine-tuning.

SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents

SeClaw is a new framework that synthesizes security tasks from structured risk specifications to evaluate autonomous LLM agents' behavior in stateful environments, focusing on the process of unsafe actions rather than just the final outcome.

SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents

SeClaw is a new framework that uses specification-driven task synthesis to create comprehensive and controllable security benchmarks for evaluating the unsafe behaviors of autonomous LLM agents.

MLEvolve: A Self-Evolving Framework for Automated Machine Learning Algorithm Discovery

MLEvolve is a novel self-evolving multi-agent framework that enables LLM agents to discover and optimize machine learning algorithms for complex, long-horizon tasks.

Agents-K1: Towards Agent-native Knowledge Orchestration

This paper introduces Agents-K1, an end-to-end knowledge orchestration pipeline that converts raw documents into agent-native scientific knowledge graphs.

Weakly Non-Negative Supermartingales for Omega-Regular Verification

The paper introduces lazy Streett supermartingales and their lexicographic extension to certify almost-sure satisfaction of omega-regular properties with polynomial templates under a broad class of sampling distributions.

Defense Against LLM Backdoors using Critical Neuron Isolation Pruning

The paper introduces DeCNIP, a method for identifying and neutralizing backdoors in large language models using representational analysis and neuron isolation pruning.

ReferTrack: Referring Then Tracking for Embodied Visual Tracking

The paper introduces ReferTrack, a method for embodied visual tracking using a single forward-facing camera, achieving state-of-the-art performance on EVT-Bench.

Papers

cs.CR · cs.AI · Empirical · Recent · Jul 22, 2026

Defense Against LLM Backdoors using Critical Neuron Isolation Pruning

Yuxi Li, Zhibo Zhang, Kailong Wang, Xingshuo Han +2 more

The paper introduces DeCNIP, a method for identifying and neutralizing backdoors in large language models using representational analysis and neuron isolation pruning.

cs.RO · Empirical · Recent