Dawn Song
11 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper introduces a contextual security framework for LLM agents, defining security properties and reformulating various attacks and defenses based on the context of execution.
The paper introduces SecPI, a fine-tuning pipeline that teaches reasoning language models (RLMs) to autonomously internalize structured security reasoning, significantly improving secure code generation without requiring explicit security prompts at inference.
The paper introduces ExploitGym, a large-scale benchmark, demonstrating that advanced AI agents can successfully turn theoretical software vulnerabilities into working exploits, highlighting growing cybersecurity risks.
The paper introduces BenchJack, an automated red-teaming system that systematically audits popular AI agent benchmarks, revealing numerous reward-hacking exploits and demonstrating a method to significantly improve benchmark robustness.
The paper proposes defining 'intent-to-execution integrity' as the necessary end-to-end correctness property for securing LLM agents, arguing that current defenses are insufficient due to untrusted components.
The paper introduces SCDBench, a comprehensive benchmark dataset and methodology that rigorously evaluates LLM-based smart contract decompilers, finding that while frontier LLMs can generate compilable code, achieving full semantic consistency remains a significant challenge.
This study provides the first large-scale measurement of prompt injection attacks in real-world LLM-based resume screening, finding that approximately 1% of resumes contain hidden injections.
The paper introduces SCDBench, a comprehensive benchmark dataset and methodology that rigorously evaluates LLM-based smart contract decompilers, finding that while frontier models can produce compilable code, achieving full semantic consistency remains a significant challenge.
This study provides the first systematic measurement of prompt injection attacks in a real-world LLM-based resume screening application, finding that approximately 1% of resumes contain hidden injections.
BenchEvolver introduces a solution-centric evolutionary framework to automatically transform saturated coding benchmarks into significantly harder, high-quality, and diverse evaluation suites.
The paper introduces CyberGym-E2E, a large-scale, end-to-end benchmark designed to comprehensively evaluate AI agents' capabilities across the entire lifecycle of real-world software vulnerability discovery, proof-of-concept generation, and patch creation.
Papers
CyberGym-E2E: Scalable Real-World Benchmark for AI Agents' End-to-End Cybersecurity Capabilities
Tianneng Shi, Robin Rheem, Dongwei Jiang, Mona Wang +12 more
The paper introduces CyberGym-E2E, a large-scale, end-to-end benchmark designed to comprehensively evaluate AI agents' capabilities across the entire lifecycle of real-world software vulnerability dis…