Isaac David

5 indexed papers

Top categories

Crypto ×5 · AI ×4 · Logic ×1

Frequent co-authors

Arthur Gervais 5×

Marco Guarnieri 1×

Research Timeline

2026

Towards Optimal Agentic Architectures for Offensive Security Tasks

The paper empirically evaluates several agentic architectures for offensive security tasks, finding that while broader coordination improves coverage, the gains are non-monotonic and the best architecture depends heavily on cost, latency, and exploit difficulty.

Alignment Contracts for Agentic Security Systems

The paper introduces alignment contracts, a formal framework for specifying and enforcing behavioral constraints over observable effect traces, ensuring that powerful agentic security systems operate only within defined scopes.

Patch2Vuln: Agentic Reconstruction of Vulnerabilities from Linux Distribution Binary Patches

The paper introduces Patch2Vuln, a pipeline that uses an LLM agent to reconstruct security vulnerabilities by analyzing differences between old and new Linux binary packages, successfully localizing patches in a majority of tested cases.

Ablating Safety: Mechanisms for Removing Alignment in Language Models for Security Applications

The paper proposes Ablating Safety, a controlled protocol for removing safety alignment from language models, demonstrating that targeted de-alignment can significantly boost security performance while preserving general capability and keeping unsafe compliance controlled.

Measuring Safety Alignment Effects in Autonomous Security Agents

The study evaluates how safety alignment affects autonomous security agents using a comprehensive trace-based benchmark, finding that while less-restricted models show gains, these effects are not universal and require system-level measurement.

Papers

cs.CR · cs.AI · Recent · May 19, 2026

Measuring Safety Alignment Effects in Autonomous Security Agents

Isaac David, Arthur Gervais

cs.CR · cs.AI · Recent · May 17, 2026