Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:
ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Home/Authors/Bo Zhang

Bo Zhang

16 indexed papers

Recent (6 mo)
16
With code
0
Influential cites
0
Benchmarked
0

Publications per year

16
26

Top categories

AI×11Crypto×10NLP×6ML×4Vision×3Biomolecules×1Multiagent×1

Frequent co-authors

Chao Shen4×
Liang He3×
Lei Bai3×
Jing Yang3×
Xia Hu3×
Jing Shao3×

Research Timeline

2026
When Safe Models Merge into Danger: Exploiting Latent Vulnerabilities in LLM Fusion

The paper introduces TrojanMerge, a framework demonstrating that model merging can be exploited to systematically compromise the safety alignment of multiple individually safe LLMs.

AttnDiff: Attention-based Differential Fingerprinting for Large Language Models

AttnDiff introduces a data-efficient white-box framework that extracts intrinsic attention-based fingerprints to verify the provenance and detect unauthorized derivation of large language models (LLMs) despite common model laundering techniques.

Constraining Host-Level Abuse in Self-Hosted Computer-Use Agents via TEE-Backed Isolation

The paper proposes an operation-centric, TEE-backed isolation model to constrain self-hosted computer-use agents, preventing malicious or unsafe host-level operations without sacrificing general functionality.

MemMark: State-Evolution Attribution Watermarking for Agent Long-Term Memory Systems

MemMark introduces a state-evolution attribution watermark that embeds owner-controlled signals into latent memory-write decisions, enabling robust provenance tracking for agent memory even when all traditional logs and metadata are lost.

Controllable Lung Nodule Synthesis via Histogram-Regularized Latent Diffusion Models

The paper introduces a histogram-regularized latent diffusion model to synthesize highly realistic and subtype-specific pulmonary nodules in 3D CT volumes, addressing the limitations of existing methods that fail to capture accurate lesion-level intensity distributions.

AgentSchool: An LLM-Powered Multi-Agent Simulation for Education

The paper introduces AgentSchool, an advanced LLM-powered multi-agent simulator that models learning as state transitions to provide a robust, ethically viable testbed for educational research and pedagogical reform.

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

The paper introduces AgentDoG 1.5, a lightweight and scalable alignment framework that significantly improves AI agent safety and security for complex, open-world agentic scenarios.

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

The paper introduces AgentDoG 1.5, a lightweight and scalable alignment framework that significantly improves AI agent safety and security for complex open-world agent deployments.

DataShield: Safety-degrading Data Filtering for LLM Benign Instruction Fine-Tuning

DataShield proposes an efficient method to identify safety-degrading samples within benign datasets, quantifying each sample's contribution to an LLM's compliance behavior.

AMix-2: Establishing Protein as a Native Modality in Large Language Models

The paper introduces AMix-2, a novel protein-text foundation model that unifies protein understanding and sequence design by embedding both modalities in a shared token space, achieving state-of-the-art performance on comprehensive benchmarks.

Towards Efficient LLMs Annealing with Principled Sample Selection

The paper proposes DiReCT, a novel framework that treats data selection during LLM annealing as a constrained optimization problem based on the spectral geometry of the loss landscape, achieving state-of-the-art performance.

DataShield: Safety-degrading Data Filtering for LLM Benign Instruction Fine-Tuning

DataShield proposes an efficient method to identify safety-degrading samples within benign datasets, preventing the degradation of LLM safety capabilities during fine-tuning.

SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents

SeClaw is a new framework that synthesizes security tasks from structured risk specifications to evaluate autonomous LLM agents' behavior in stateful environments, focusing on the process of unsafe actions rather than just the final outcome.

SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents

SeClaw is a new framework that uses specification-driven task synthesis to create comprehensive and controllable security benchmarks for evaluating the unsafe behaviors of autonomous LLM agents.

MLEvolve: A Self-Evolving Framework for Automated Machine Learning Algorithm Discovery

MLEvolve is a novel self-evolving multi-agent framework that enables LLM agents to discover and optimize machine learning algorithms for complex, long-horizon tasks.

Agents-K1: Towards Agent-native Knowledge Orchestration

This paper introduces Agents-K1, an end-to-end knowledge orchestration pipeline that converts raw documents into agent-native scientific knowledge graphs.

Highlighted terms show continued research focus across papers

Papers

cs.AIEmpiricalRecentJun 11, 2026

Agents-K1: Towards Agent-native Knowledge Orchestration

Zongsheng Cao, Bihao Zhan, Jinxin Shi, Jiong Wang +21 more

This paper introduces Agents-K1, an end-to-end knowledge orchestration pipeline that converts raw documents into agent-native scientific knowledge graphs.

View →
cs.AIcs.CLRecentJun 4, 2026

MLEvolve: A Self-Evolving Framework for Automated Machine Learning Algorithm Discovery

Shangheng Du, Xiangchao Yan, Jinxin Shi, Zongsheng Cao +10 more

MLEvolve is a novel self-evolving multi-agent framework that enables LLM agents to discover and optimize machine learning algorithms for complex, long-horizon tasks.

View →
cs.CRcs.AIRecentJun 1, 2026

SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents

Hao Cheng, Changtao Miao, Tianle Song, Yin Wu +20 more

SeClaw is a new framework that synthesizes security tasks from structured risk specifications to evaluate autonomous LLM agents' behavior in stateful environments, focusing on the process of unsafe ac…

View →
cs.CRcs.AIRecentJun 1, 2026

SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents

Hao Cheng, Changtao Miao, Tianle Song, Yin Wu +20 more

SeClaw is a new framework that uses specification-driven task synthesis to create comprehensive and controllable security benchmarks for evaluating the unsafe behaviors of autonomous LLM agents.

View →
cs.CRcs.AIcs.CLRecentMay 29, 2026

DataShield: Safety-degrading Data Filtering for LLM Benign Instruction Fine-Tuning

Junbo Zhang, Qianli Zhou, Xinyang Deng, Wen Jiang +2 more

DataShield proposes an efficient method to identify safety-degrading samples within benign datasets, quantifying each sample's contribution to an LLM's compliance behavior.

View →
q-bio.BMcs.AIRecentMay 29, 2026

AMix-2: Establishing Protein as a Native Modality in Large Language Models

Keyue Qiu, Yixin Wu, Lihao Wang, Yawen Ouyang +18 more

The paper introduces AMix-2, a novel protein-text foundation model that unifies protein understanding and sequence design by embedding both modalities in a shared token space, achieving state-of-the-a…

View →
cs.CLRecentMay 29, 2026

Towards Efficient LLMs Annealing with Principled Sample Selection

Yuanjian Xu, Jianing Hao, Wanbo Zhang, Zhong Li +1 more

The paper proposes DiReCT, a novel framework that treats data selection during LLM annealing as a constrained optimization problem based on the spectral geometry of the loss landscape, achieving state…

View →
cs.CRcs.AIcs.CLRecentMay 29, 2026

DataShield: Safety-degrading Data Filtering for LLM Benign Instruction Fine-Tuning

Junbo Zhang, Qianli Zhou, Xinyang Deng, Wen Jiang +2 more

DataShield proposes an efficient method to identify safety-degrading samples within benign datasets, preventing the degradation of LLM safety capabilities during fine-tuning.

View →
cs.CVcs.AIcs.LGRecentMay 28, 2026

Controllable Lung Nodule Synthesis via Histogram-Regularized Latent Diffusion Models

Arunkumar Kannan, Yanbo Zhang, Han Liu, Michael Baumgartner +4 more

The paper introduces a histogram-regularized latent diffusion model to synthesize highly realistic and subtype-specific pulmonary nodules in 3D CT volumes, addressing the limitations of existing metho…

View →
cs.AIcs.MARecentMay 28, 2026

AgentSchool: An LLM-Powered Multi-Agent Simulation for Education

Yulei Ye, Wenhao Li, Zhong Wen, Yunshu Huang +22 more

The paper introduces AgentSchool, an advanced LLM-powered multi-agent simulator that models learning as state transitions to provide a robust, ethically viable testbed for educational research and ped…

View →
cs.AIcs.CLcs.CRRecentMay 28, 2026

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

Dongrui Liu, Yu Li, Zhonghao Yang, Peng Wang +46 more

The paper introduces AgentDoG 1.5, a lightweight and scalable alignment framework that significantly improves AI agent safety and security for complex, open-world agentic scenarios.

View →
cs.AIcs.CLcs.CRRecentMay 28, 2026

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

Dongrui Liu, Yu Li, Zhonghao Yang, Peng Wang +46 more

The paper introduces AgentDoG 1.5, a lightweight and scalable alignment framework that significantly improves AI agent safety and security for complex open-world agent deployments.

View →
cs.CRRecentMay 24, 2026

MemMark: State-Evolution Attribution Watermarking for Agent Long-Term Memory Systems

Haobo Zhang, Xutao Mao, Guangyuan Dong, Ziwei Li +4 more

MemMark introduces a state-evolution attribution watermark that embeds owner-controlled signals into latent memory-write decisions, enabling robust provenance tracking for agent memory even when all t…

View →
cs.CRRecentMay 7, 2026

Constraining Host-Level Abuse in Self-Hosted Computer-Use Agents via TEE-Backed Isolation

Di Lu, Bo Zhang, Xiyuan Li, Yongzhi Liao +4 more

The paper proposes an operation-centric, TEE-backed isolation model to constrain self-hosted computer-use agents, preventing malicious or unsafe host-level operations without sacrificing general funct…

View →
cs.CRcs.LGRecentApr 7, 2026

AttnDiff: Attention-based Differential Fingerprinting for Large Language Models

Haobo Zhang, Zhenhua Xu, Junxian Li, Shangfeng Sheng +2 more

AttnDiff introduces a data-efficient white-box framework that extracts intrinsic attention-based fingerprints to verify the provenance and detect unauthorized derivation of large language models (LLMs…

View →
cs.CRRecentApr 1, 2026

When Safe Models Merge into Danger: Exploiting Latent Vulnerabilities in LLM Fusion

Jiaqing Li, Zhibo Zhang, Shide Zhou, Yuxi Li +2 more

The paper introduces TrojanMerge, a framework demonstrating that model merging can be exploited to systematically compromise the safety alignment of multiple individually safe LLMs.

View →