Anthropic releases open-source framework for AI vulnerability discovery

Anthropic has published an open-source reference implementation that uses its Claude AI model to find and fix security vulnerabilities in source code. The project, called the Defending Code Reference Harness, draws on what the company has learned from working with security teams at several organizations since launching Claude Mythos Preview.

The repository is not maintained and does not accept contributions. It serves as a starting point for developers who want to build their own vulnerability discovery pipeline using Claude. Users can customize the logic and run it through whatever Claude API access they have, including Bedrock, Vertex, or Azure.

What the repository includes

The open-source code includes several Claude Code skills: /quickstart, /threat-model, /vuln-scan, /triage, /patch, and /customize. These skills allow interactive scoping, scanning, triage, and patching. Users can open the repository in Claude Code and run /quickstart to get oriented.

The harness folder contains the autonomous reference pipeline, which steps through recon, find, verify, report, and patch. It is configured to find memory vulnerabilities in C and C++ code using Docker and AddressSanitizer (ASAN). The harness is a reference, not a product, so it will not work on every codebase without customization.

Security and sandboxing

The skills /quickstart, /threat-model, /vuln-scan, and /triage only read and write files. Running /patch on static findings likewise only reads and writes files. /customize edits the harness code and runs validation commands. These skills are safe to run unsandboxed as long as users review and approve each tool use in Claude Code. The autonomous reference pipeline, however, executes target code and refuses to run outside a gVisor sandbox unless explicitly overridden. Users must run scripts/setup_sandbox.sh once and then invoke the pipeline via bin/vp-sandboxed.
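
The refuse-unless-overridden behavior can be pictured as a small guard at a pipeline's entry point. The sketch below is illustrative only and is not the harness's code: the environment variable names IN_SANDBOX and ALLOW_UNSANDBOXED are hypothetical, and this article does not describe how the real pipeline detects gVisor.

```python
import os


class SandboxRequiredError(RuntimeError):
    """Raised when the pipeline is started outside a sandbox."""


def require_sandbox(env=None):
    """Return a short status string, or raise if no sandbox is detected.

    Both variable names below are hypothetical placeholders.
    """
    env = os.environ if env is None else env
    if env.get("IN_SANDBOX") == "1":
        return "sandboxed"
    if env.get("ALLOW_UNSANDBOXED") == "1":
        # Explicit override: the operator accepts the risk of running
        # target code directly on the host.
        return "override"
    raise SandboxRequiredError(
        "refusing to execute target code outside a sandbox"
    )
```

Failing closed by default means that a forgotten setup step stops the run instead of silently executing untrusted code on the host.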

Getting started

To begin, clone the repository, navigate into it, and run the claude command. Then type /quickstart for a 30-second introduction and guided first run on the canary target. Users can ask questions like how to port the pipeline to Java or how to triage bugs.

The repository includes further reading materials: a blog post with learnings and best practices, documentation on how the pipeline works, security guidelines, agent sandbox details, customization instructions, patching guidance, troubleshooting tips, and safeguards for blocking dangerous cyber work.

Ramp-up steps

The documentation describes a four-step ramp-up plan for security teams. It notes that the most successful teams get hands-on quickly rather than spending months designing the perfect pipeline.

Step 1: Day 1 - Build a threat model and run first static scan

On day one, users build a threat model and run a static scan scoped by it. They triage the results and draft candidate fixes. By the end of the day they have a threat model, a ranked list of static findings, and candidate patches. This step only reads and writes files and does not require a sandbox if running interactively.

The commands for this step are exporting a subagent model, launching Claude, and then running /quickstart, /threat-model bootstrap, /vuln-scan, /triage, and /patch. The outputs are THREAT_MODEL.md, VULN-FINDINGS files, TRIAGE files, and a PATCHES directory.

Step 2: Day 2 - Run the reference pipeline on a C/C++ library

On day two, users move to an autonomous run using the reference pipeline. They run the full recon-find-verify-report loop on a known-vulnerable open-source library and generate a candidate patch. The run requires one-time setup: creating a Python virtual environment, installing dependencies, setting up the gVisor sandbox, and setting the API key or OAuth token.

The command to run the pipeline is bin/vp-sandboxed run drlibs, with parameters for model, runs, parallel, stream, and auto-focus. Running bin/vp-sandboxed patch then generates fixes. Results go into a results/drlibs/<timestamp> directory. With the --stream flag, the first report appears within minutes.

The pipeline has seven stages: Build, Recon, Find, Verify, Dedupe, Report, and Patch. Build compiles the target into a Docker image with ASAN. Recon partitions the code into attack surfaces. Find runs multiple agents in parallel to craft inputs and crash the binary. Verify reproduces crashes in a fresh container. Dedupe compares crashes against known bugs. Report writes exploitability analysis. Patch generates and validates fixes.
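
The stage order can be sketched as a simple sequential runner. This is a minimal illustration in Python, not the harness's implementation: the run_pipeline function, the handler signature, and the passthrough placeholder are all assumptions. Only the seven stage names come from the description above.

```python
from typing import Callable, Dict, List

# Stage names in the order given above.
STAGES: List[str] = [
    "build", "recon", "find", "verify", "dedupe", "report", "patch",
]

# Hypothetical handler type: takes the shared state and returns it updated.
Handler = Callable[[dict], dict]


def run_pipeline(handlers: Dict[str, Handler], state: dict) -> dict:
    """Run each stage in order, passing state from one stage to the next."""
    for name in STAGES:
        if name not in handlers:
            raise KeyError(f"no handler registered for stage {name!r}")
        state = handlers[name](state)
        # Record progress so a partial run shows where it stopped.
        state.setdefault("completed", []).append(name)
    return state


def passthrough(state: dict) -> dict:
    """Placeholder stage that does no work."""
    return state
```

Calling run_pipeline({name: passthrough for name in STAGES}, {}) walks all seven stages and records each one under the "completed" key. A real stage would add its own outputs to the state, such as crashes from Find or reproduced crashes from Verify.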

Step 3: Days 3-5 - Customize the pipeline

On days three through five, users customize the harness for their own target. They first point the interactive skills at their own code, then use /customize to port the pipeline to their stack. By the end of the week they have a targets/my-service directory ready for a smoke run.

The reference pipeline is designed for C/C++ memory bugs but its structure is generic. Porting requires adapting three things: what signals a finding, what a proof of concept looks like, and how the target is built and run.
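
One way to picture those three porting points is as an interface that each target implements. The sketch below is an assumption made for illustration: the TargetAdapter protocol, its method names, the RunResult type, and the toy AsanTarget class are hypothetical and do not reflect the harness's actual layout.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class RunResult:
    exit_code: int
    stderr: str


class TargetAdapter(Protocol):
    """Hypothetical interface covering the three porting points."""

    def build_command(self) -> List[str]:
        """How the target is built and run."""
        ...

    def is_finding(self, result: RunResult) -> bool:
        """What signals a finding."""
        ...

    def proof_of_concept(self, payload: bytes) -> dict:
        """What a proof of concept looks like."""
        ...


class AsanTarget:
    """Toy adapter for a C/C++ target where a sanitizer report is the signal."""

    def build_command(self) -> List[str]:
        # Image tag is a made-up example.
        return ["docker", "build", "-t", "target-asan", "."]

    def is_finding(self, result: RunResult) -> bool:
        return result.exit_code != 0 and "AddressSanitizer" in result.stderr

    def proof_of_concept(self, payload: bytes) -> dict:
        return {"kind": "crashing-input", "input_hex": payload.hex()}
```

Under this framing, a port to another stack would swap the signal (for example, an uncaught exception instead of a sanitizer report) and the proof-of-concept shape while leaving the surrounding loop unchanged.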

Step 4: Week 2 - Autonomous scanning, triage, and patching

In week two, users run the customized pipeline on their own targets in an outer loop. They scan over multiple runs, triage findings across those runs, patch according to priority, and repeat. The triage skill collapses duplicates across runs and recalibrates severity against the threat model. Patching keeps later runs from re-finding the same bugs and surfaces deeper issues.
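
Collapsing duplicates across runs can be sketched as grouping findings by a stable key and keeping one representative per group. This is a hedged illustration: the Finding fields, the choice of key (file, function, bug class), and the numeric severity scale are assumptions, not the triage skill's actual logic.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Finding:
    run_id: str
    file: str
    function: str
    bug_class: str
    severity: int  # hypothetical scale: higher means more severe


def collapse_duplicates(findings: Iterable[Finding]) -> List[Finding]:
    """Keep the highest-severity finding per (file, function, bug_class)."""
    best: Dict[Tuple[str, str, str], Finding] = {}
    for f in findings:
        key = (f.file, f.function, f.bug_class)
        if key not in best or f.severity > best[key].severity:
            best[key] = f
    # Most severe first, so the list can drive patch prioritization.
    return sorted(best.values(), key=lambda f: -f.severity)
```

A key this coarse will sometimes merge distinct bugs in the same function, which is one reason the severity and prioritization calls still need human review.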

The documentation notes that autonomous triage and patching are still open issues. The verification strategies in /patch raise the bar, but severity and prioritization are environmental judgments. Many partners report these steps as bottlenecks requiring real engineering time.

Looking forward

After the initial ramp-up, teams tend to invest in four areas: reviewing all internal repos and key open-source dependencies, setting up bespoke scanning infrastructure to move off laptops, incorporating recurring or CI-based scans into the SDLC, and experimenting with models to find what works best.
