Kejiang Chen
4 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper introduces Jargon, a novel adversarial framework that exploits the vulnerability of LLMs to context-specific safety boundary blurring, achieving high attack success rates across multiple frontier models.
The paper introduces ReTokSync, a self-synchronizing framework that resolves tokenization ambiguity in Generative Linguistic Steganography (GLS) by correcting mismatches only when they occur, thereby maintaining high security and extraction accuracy.
The paper introduces a formal, logically constrained framework, ePCA, to secure advanced AI agents by forcing them to translate natural language intentions into first-order logical constraints before execution, achieving provably secure performance.
The paper introduces an executable Proof-Constrained Action (ePCA) framework that secures AI agents by forcing them to formalize their intentions into first-order logical constraints, achieving provably secure operation.
Papers
Provably Secure Agent Guardrail
Benlong Wu, Weiming Zhang, Kejiang Chen, Han Fang +1 more
The paper introduces a formal, logically constrained framework, ePCA, to secure advanced AI agents by forcing them to translate natural language intentions into first-order logical constraints before…