ArXivCSExplorer

Kai Zhang

8 indexed papers

Recent (6 mo): 8
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 8

Top categories

AI ×7, Crypto ×5, Vision ×2, Robotics ×1, ML ×1, NLP ×1

Frequent co-authors

Jiahao Xu (2×)
Rui Hu (2×)
Olivera Kotevska (2×)
Zikai Zhang (2×)
Yiqi Wang (1×)
Jiaqi Zhang (1×)

Research Timeline

2026
SelfGrader: LLM Jailbreak Detection via Anchored Token-Level Logits

SelfGrader proposes a lightweight, robust guardrail for detecting LLM jailbreaks by formulating the detection problem as a numerical grading task using anchored token-level logits, achieving strong performance across various benchmarks.

XMark: Reliable Multi-Bit Watermarking for LLM-Generated Texts

XMark introduces a novel multi-bit watermarking technique that reliably embeds binary messages into LLM-generated text while maintaining high text quality and robust performance even with limited token context.

ClawGuard: Out-of-Band Detection of LLM Agent Workflow Hijacking via EM Side Channel

ClawGuard introduces a passive, out-of-band security monitor that detects LLM agent workflow hijacking by analyzing unique electromagnetic (EM) emanations generated during agent skill execution.

MT-JailBench: A Modular Benchmark for Understanding Multi-Turn Jailbreak Attacks

The paper introduces MT-JailBench, a modular framework for evaluating multi-turn jailbreaks, demonstrating that controlling experimental components like prompt generation and resource budgets is crucial for fair comparison and understanding attack success.

From Fact Overwriting to Knowledge Evolution: Causal Editing via On-Policy Self-Distillation

The paper introduces Causal Editing (CODE), a new paradigm that improves knowledge updates in LLMs by grounding fact injection in causal narratives, drastically reducing self-refutation rates.

CardioLens: Revealing the Clinical Reality Gap of MLLMs via Multi-Sequence Cardiac MRI Evaluations

The paper introduces CardioLens, a rigorous evaluation testbed for multi-sequence Cardiac MRI, which reveals that current Multimodal Large Language Models (MLLMs) exhibit a significant 'clinical reality gap' and perform poorly when simulating real-world cardiac interpretation workflows.

Humanoid-GPT: Scaling Data and Structure for Zero-Shot Motion Tracking

The paper introduces Humanoid-GPT, a large-scale generative Transformer model that achieves robust zero-shot motion tracking and control by training on a massive, unified corpus of motion data.

From Agent Traces to Trust: Evidence Tracing and Execution Provenance in LLM Agents

This survey provides a systematic framework and taxonomy for evidence tracing and execution provenance in LLM agents, addressing the difficulty of verifying and auditing complex agent behaviors.


Papers

cs.CR, cs.AI · Recent · Jun 3, 2026

From Agent Traces to Trust: Evidence Tracing and Execution Provenance in LLM Agents

Yiqi Wang, Jiaqi Zhang, Taotao Cai, Zirui Liu +5 more

This survey provides a systematic framework and taxonomy for evidence tracing and execution provenance in LLM agents, addressing the difficulty of verifying and auditing complex agent behaviors.

cs.RO, cs.AI, cs.CV · Recent · Jun 2, 2026

Humanoid-GPT: Scaling Data and Structure for Zero-Shot Motion Tracking

Zekun Qi, Xuchuan Chen, Dairu Liu, Chenghuai Lin +9 more

The paper introduces Humanoid-GPT, a large-scale generative Transformer model that achieves robust zero-shot motion tracking and control by training on a massive, unified corpus of motion data.

cs.CV, cs.AI, cs.LG · Recent · May 28, 2026

CardioLens: Revealing the Clinical Reality Gap of MLLMs via Multi-Sequence Cardiac MRI Evaluations

Zixian Su, Hongkai Zhang, Fan Gao, Encheng Su +11 more

The paper introduces CardioLens, a rigorous evaluation testbed for multi-sequence Cardiac MRI, which reveals that current Multimodal Large Language Models (MLLMs) exhibit a significant 'clinical reality gap' and perform poorly when simulating real-world cardiac interpretation workflows.

cs.AI · Recent · May 27, 2026

From Fact Overwriting to Knowledge Evolution: Causal Editing via On-Policy Self-Distillation

Shuaike Li, Kai Zhang, Xianquan Wang, Jiachen Liu +1 more

The paper introduces Causal Editing (CODE), a new paradigm that improves knowledge updates in LLMs by grounding fact injection in causal narratives, drastically reducing self-refutation rates.

cs.CR, cs.AI · Recent · May 10, 2026

MT-JailBench: A Modular Benchmark for Understanding Multi-Turn Jailbreak Attacks

Xinkai Zhang, Zhipeng Wei, Huanli Gong, Jing Ting Zheng +3 more

The paper introduces MT-JailBench, a modular framework for evaluating multi-turn jailbreaks, demonstrating that controlling experimental components like prompt generation and resource budgets is crucial for fair comparison and understanding attack success.

cs.CR · Recent · May 7, 2026

ClawGuard: Out-of-Band Detection of LLM Agent Workflow Hijacking via EM Side Channel

Leo Linqian Gan, Jeffery Wu, Longyuan Ge, Lanqing Yang +5 more

ClawGuard introduces a passive, out-of-band security monitor that detects LLM agent workflow hijacking by analyzing unique electromagnetic (EM) emanations generated during agent skill execution.

cs.CL, cs.AI, cs.CR · Recent · Apr 6, 2026

XMark: Reliable Multi-Bit Watermarking for LLM-Generated Texts

Jiahao Xu, Rui Hu, Olivera Kotevska, Zikai Zhang

XMark introduces a novel multi-bit watermarking technique that reliably embeds binary messages into LLM-generated text while maintaining high text quality and robust performance even with limited token context.

cs.CR, cs.AI · Recent · Apr 1, 2026

SelfGrader: LLM Jailbreak Detection via Anchored Token-Level Logits

Zikai Zhang, Rui Hu, Olivera Kotevska, Jiahao Xu

SelfGrader proposes a lightweight, robust guardrail for detecting LLM jailbreaks by formulating the detection problem as a numerical grading task using anchored token-level logits, achieving strong performance across various benchmarks.
