Jinyuan Jia
6 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper introduces EnsembleSHAP, a novel, computationally efficient, and provably robust feature attribution method specifically designed for the Random Subspace Method to provide secure explanations.
AgentWatcher is a novel, rule-based monitor designed to detect prompt injection attacks in LLM agents by focusing detection on causally influential context segments, thereby improving scalability and explainability.
The paper introduces TRUSTDESC, a novel framework that prevents tool poisoning attacks in LLM applications by automatically generating highly accurate and trusted tool descriptions directly from the tool's source code and behavior.
The paper introduces PIArena, a unified and extensible platform designed to address the lack of standardized evaluation for prompt injection, revealing critical limitations in current state-of-the-art defenses.
The paper introduces FlashRT, a novel framework that significantly improves the computational and memory efficiency of optimization-based red-teaming attacks against long-context LLMs, enabling systematic security evaluation at scale.
CleanBase is a method that detects malicious documents in RAG knowledge databases by identifying clusters (cliques) of documents that exhibit unusually high semantic similarity.
Papers
CleanBase: Detecting Malicious Documents in RAG Knowledge Databases
Weifei Jin, Xilong Wang, Wei Zou, Jinyuan Jia +1 more
CleanBase is a method that detects malicious documents in RAG knowledge databases by identifying clusters (cliques) of documents that exhibit unusually high semantic similarity.