Bryan Hooi
7 indexed papers
Research Timeline
The paper introduces WebAgentGuard, a novel reasoning-driven, multimodal guard model that effectively detects prompt injection attacks in vulnerable web agents without compromising their functionality.
The paper proposes WARD, a robust and efficient defense model that secures web agents against prompt injection attacks embedded in web content, achieving high recall and low false positives even against adaptive attacks.
The paper introduces BITE, a black-box adversarial framework that exploits stylistic biases in LLM judges by adaptively generating semantically equivalent edits to artificially inflate assigned scores.
AliMark proposes a novel watermarking framework that treats sentence-level watermarking as a bit sequence alignment problem, significantly enhancing robustness against structural text perturbations like sentence splitting and merging.
AliMark proposes a novel framework that enhances the robustness of sentence-level watermarking by reformulating the problem as a bit sequence encoding and alignment task, significantly improving resilience against structural text perturbations like sentence splitting and merging.
FineVerify introduces a fine-grained self-verification framework that improves agentic search by decomposing complex questions into verifiable sub-questions, leading to significant accuracy gains over standard scaling methods.
The paper introduces EvoNote, a self-evolving agentic framework that significantly improves the generation of evidence-grounded health community notes by utilizing an accumulated memory of past misinformation correction experiences.
Papers
Better with Experience: Self-Evolving LLM Agents for Evidence-Grounded Health Community Notes
Zihang Fu, Fanxiao Li, Jianyang Gu, Haonan Wang +4 more
The paper introduces EvoNote, a self-evolving agentic framework that significantly improves the generation of evidence-grounded health community notes by utilizing an accumulated memory of past misinf…