Lu Yan
2 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper introduces SafeAudit, a meta-audit framework that systematically enumerates test cases and uses a quantitative metric to uncover significant residual unsafe behaviors in LLM agents that existing benchmarks miss.
The paper introduces WIRE, a pipeline for diagnosing live intra-policy rule conflicts in LLM agents by identifying and testing specific rule pairs within a single prompt policy that can co-govern a realistic state.
Papers
Diagnosing Live Within-Policy Instruction Conflicts in LLM Agents with Witnessed Resolution Profiles
The paper introduces WIRE, a pipeline for diagnosing live intra-policy rule conflicts in LLM agents by identifying and testing specific rule pairs within a single prompt policy that can co-govern a re…