Zehua Zhang
2 indexed papers
Research Timeline
The paper proposes a mutagenic incentive intervention approach that mitigates collusion in embodied multi-agent systems by reshaping agents' payoff structures, effectively inducing defection and maintaining system efficiency.
The paper introduces SKILLSCOPE, a system that detects security-relevant behaviors in code-backed LLM skills that their natural-language descriptions do not disclose; it finds that 9.4% of skills exhibit such inconsistencies.
Papers
Do Skill Descriptions Tell the Truth? Detecting Undisclosed Security Behaviors in Code-Backed LLM Skills