# Bug Hunt - Adversarial Bug Finding

Run a sequential-first adversarial bug hunt on your codebase. Use parallelism only for read-only triage and independent verification tasks.

## Table of Contents

- Usage
- Target
- Context Budget
- Execution Steps
- Step 7: Present the Final Report
- Self-Test Mode
- Error handling
**Phase 1 — Find & Verify:**

```
Recon (map) --> Hunter (deep scan) --> Skeptic (challenge) --> Referee (final verdict)
                     ^ (optional read-only dual-lens triage can run here)
                     |
           state + chunk checkpoints
```

**Phase 2 — Fix & Verify (default when bugs are confirmed):**

```
Baseline --> Git branch --> sequential Fixer (single writer) --> targeted verify --> full verify --> report
                 ^                                                                       |
                 +------------------------ checkpoint commits + auto-revert -------------+
```

- For small scans (1-10 source files): runs a single Hunter + single Skeptic (no parallelism overhead).
- For large scans: process chunks sequentially with persistent state to avoid compaction drift.
## Usage

```
/bug-hunter                             # Scan entire project
/bug-hunter src/                        # Scan specific directory
/bug-hunter lib/auth.ts                 # Scan specific file
/bug-hunter -b feature-xyz              # Scan files changed in feature-xyz vs main
/bug-hunter -b feature-xyz --base dev   # Scan files changed in feature-xyz vs dev
/bug-hunter --staged                    # Scan staged files (pre-commit check)
/bug-hunter --scan-only src/            # Scan only, no code changes
/bug-hunter --fix src/                  # Find bugs AND auto-fix them
/bug-hunter --autonomous src/           # Alias for a no-intervention auto-fix run
/bug-hunter --fix -b feature-xyz        # Find + fix on a branch diff
/bug-hunter --fix --approve src/        # Find + fix, but ask before each fix
/bug-hunter src/                        # Loops by default: audit + fix until all queued source files are covered
/bug-hunter --no-loop src/              # Single pass only, no iterating
/bug-hunter --no-loop --scan-only src/  # Single-pass scan, no fixes, no loop
/bug-hunter --deps src/                 # Include dependency CVE scan
/bug-hunter --threat-model src/         # Generate/use a STRIDE threat model
/bug-hunter --deps --threat-model src/  # Full security audit
/bug-hunter --fix --dry-run src/        # Preview fixes without editing files
```
## Target

The raw arguments are: `$ARGUMENTS`

Parse the arguments as follows:

- **0a.** Default `LOOP_MODE=true`. If arguments contain `--no-loop`, strip it from the arguments and set `LOOP_MODE=false`. The `--loop` flag is accepted for backwards compatibility but is a no-op (loop is already the default).
- **0b.** Default `FIX_MODE=true`.
- **0c.** If arguments contain `--scan-only`, strip it from the arguments and set `FIX_MODE=false`.
- **0d.** If arguments contain `--fix`, strip it from the arguments and set `FIX_MODE=true`. The remaining arguments are parsed normally below.
- **0e.** If arguments contain `--autonomous`, strip it from the arguments, set `AUTONOMOUS_MODE=true`, and force `FIX_MODE=true` (canary-first + confidence-gated).
- **0f.** If arguments contain `--approve`, strip it from the arguments and set `APPROVE_MODE=true`. When this flag is set, Fixer agents run in `mode: "default"` (the user reviews and approves each edit). When not set, `APPROVE_MODE=false` and Fixers run autonomously.
- **0g.** If arguments contain `--deps`, strip it and set `DEP_SCAN=true`. Dependency scanning runs package-manager audit tools and checks whether vulnerable APIs are actually called in the codebase.
- **0h.** If arguments contain `--threat-model`, strip it and set `THREAT_MODEL_MODE=true`. This generates a STRIDE threat model at `.bug-hunter/threat-model.md` if one doesn't exist, then feeds it to Recon + Hunter for targeted security analysis.
- **0i.** If arguments contain `--dry-run`, strip it and set `DRY_RUN_MODE=true`. This forces `FIX_MODE=true`. In dry-run mode, Phase 2 builds the fix plan and the Fixer reads code and outputs planned changes as unified-diff previews, but no file edits, git commits, or lock acquisition occur. Produces `fix-report.json` with `"dry_run": true`.
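Taken together, steps 0a-0i amount to a simple strip-and-set pass over the argument list. As a sketch (the function name and return shape are illustrative, not part of the skill's scripts):

```javascript
// Hypothetical sketch of the 0a-0i flag-stripping logic described above.
// Defaults: loop and fix are ON; recognized flags are stripped from the list.
function parseFlags(argv) {
  const opts = {
    LOOP_MODE: true, FIX_MODE: true, AUTONOMOUS_MODE: false,
    APPROVE_MODE: false, DEP_SCAN: false, THREAT_MODEL_MODE: false,
    DRY_RUN_MODE: false,
  };
  const rest = [];
  for (const arg of argv) {
    switch (arg) {
      case "--no-loop": opts.LOOP_MODE = false; break;
      case "--loop": break; // accepted for backwards compatibility; no-op
      case "--scan-only": opts.FIX_MODE = false; break;
      case "--fix": opts.FIX_MODE = true; break;
      case "--autonomous": opts.AUTONOMOUS_MODE = true; opts.FIX_MODE = true; break;
      case "--approve": opts.APPROVE_MODE = true; break;
      case "--deps": opts.DEP_SCAN = true; break;
      case "--threat-model": opts.THREAT_MODEL_MODE = true; break;
      case "--dry-run": opts.DRY_RUN_MODE = true; opts.FIX_MODE = true; break;
      default: rest.push(arg); // remaining args handled by the mode rules below
    }
  }
  return { opts, rest };
}
```

Note that ordering matters when combining flags (e.g. a later `--fix` overrides an earlier `--scan-only` in this sketch); the prose above does not specify a precedence, so treat that as an open design choice.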
1. If arguments contain `--staged`, this is **staged file mode**.
   - Run `git diff --cached --name-only` using the Bash tool to get the list of staged files.
   - If the command fails, report the error to the user and stop.
   - If no files are staged, tell the user there are no staged changes to scan and stop.
   - The scan target is the list of staged files (scan their full contents, not just the diff).
2. If arguments contain `-b`, this is **branch diff mode**.
   - Extract the branch name after `-b`.
   - If `--base` is also present, use that as the base branch. Otherwise default to `main`.
   - Run `git diff --name-only` ... using the Bash tool to get the list of changed files.
   - If the command fails (e.g. branch not found), report the error to the user and stop.
   - If no files changed, tell the user there are no changes to scan and stop.
   - The scan target is the list of changed files (scan their full contents, not just the diff).
3. If arguments contain neither `-b` nor `--staged`, treat the entire argument string as a **path target** (file or directory). If empty, scan the current working directory.
After resolving the file list (for modes 1 and 2), filter out non-source files. Remove any files matching these patterns — they are not scannable source code:

- Docs/text: `*.md`, `*.txt`, `*.rst`, `*.adoc`
- Config: `*.json`, `*.yaml`, `*.yml`, `*.toml`, `*.ini`, `*.cfg`, `.env*`, `.gitignore`, `.editorconfig`, `.prettierrc*`, `.eslintrc*`, `tsconfig.json`, `jest.config.*`, `vitest.config.*`, `webpack.config.*`, `vite.config.*`, `next.config.*`, `tailwind.config.*`
- Lockfiles: `*.lock`, `*.sum`
- Minified/maps: `*.min.js`, `*.min.css`, `*.map`
- Assets: `*.svg`, `*.png`, `*.jpg`, `*.gif`, `*.ico`, `*.woff`, `*.ttf`, `*.eot`
- Project meta: `LICENSE`, `CHANGELOG*`, `CONTRIBUTING*`, `CODE_OF_CONDUCT*`, `Makefile`, `Dockerfile`, `docker-compose*`, `Procfile`
- Vendor dirs: `node_modules/`, `vendor/`, `dist/`, `build/`, `.next/`, `__pycache__/`, `.venv/`

If after filtering there are zero source files left, tell the user: "No scannable source files found — only config/docs/assets were changed." and stop.
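The filter can be sketched as a predicate over file paths. This is a simplified illustration using extension and substring checks rather than full glob matching; the skill's own triage script may implement it differently:

```javascript
// Illustrative sketch of the non-source filter above. Simplified: extension
// and path-substring checks stand in for real glob matching.
const SKIP_EXT = [".md", ".txt", ".rst", ".adoc", ".json", ".yaml", ".yml",
  ".toml", ".ini", ".cfg", ".lock", ".sum", ".map", ".svg", ".png", ".jpg",
  ".gif", ".ico", ".woff", ".ttf", ".eot", ".min.js", ".min.css"];
const SKIP_DIRS = ["node_modules/", "vendor/", "dist/", "build/", ".next/",
  "__pycache__/", ".venv/"];
const SKIP_NAMES = ["LICENSE", "Makefile", "Dockerfile", "Procfile",
  ".gitignore", ".editorconfig"];

function isScannableSource(path) {
  if (SKIP_DIRS.some((d) => path.includes(d))) return false;
  const base = path.split("/").pop();
  if (SKIP_NAMES.includes(base)) return false;
  // Prefix-matched meta/config families from the list above
  if (/^(CHANGELOG|CONTRIBUTING|CODE_OF_CONDUCT|docker-compose|\.env|\.prettierrc|\.eslintrc)/.test(base)) return false;
  return !SKIP_EXT.some((ext) => base.endsWith(ext));
}
```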
## Context Budget

FILE_BUDGET is computed by the triage script (Step 1), not by Recon. The triage script samples 30 files from the codebase, computes the average line count, and derives:

```
avg_tokens_per_file = average_lines_per_file * 4
FILE_BUDGET = floor(150000 / avg_tokens_per_file)   # capped at 60, floored at 10
```

Triage also determines the strategy directly, so Step 3 just reads the triage output — no circular dependency.
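The budget derivation can be sketched as follows (names assumed; the real computation lives in `scripts/triage.cjs`):

```javascript
// Sketch of the FILE_BUDGET formula described above: ~4 tokens per line,
// a 150k-token budget, clamped to the range [10, 60].
function computeFileBudget(avgLinesPerFile) {
  const avgTokensPerFile = avgLinesPerFile * 4;
  const raw = Math.floor(150000 / avgTokensPerFile);
  return Math.min(60, Math.max(10, raw)); // capped at 60, floored at 10
}
```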
Then determine partitioning:

| Total source files | Strategy | Hunters | Skeptics |
| --- | --- | --- | --- |
| 1 | Single-file mode | 1 general | 1 |
| 2-10 | Small mode | 1 general | 1 |
| 11 to FILE_BUDGET | Parallel mode (hybrid) | 1 deep Hunter (+ optional 2 read-only triage Hunters) | 1-2 by directory |
| FILE_BUDGET+1 to FILE_BUDGET*2 | Extended mode | Sequential chunked Hunters | 1-2 by directory |
| FILE_BUDGET*2+1 to FILE_BUDGET*3 | Scaled mode | Sequential chunked Hunters with resume state | 1-2 by directory |
| > FILE_BUDGET*3 | Large-codebase mode + Loop | Domain-scoped pipelines + boundary audits | Per-domain 1-2 |
If triage was not run (e.g., Recon was called directly without the orchestrator), use the default FILE_BUDGET of 40.

File partitioning rules (Extended/Scaled modes):

- **Service-aware partitioning (preferred):** if Recon detected multiple service boundaries (monorepo), partition by service.
- **Risk-tier partitioning (fallback):** process CRITICAL, then HIGH, then MEDIUM, then LOW.
- Keep chunk size small (20-40 files recommended) to avoid context-compaction issues.
- Persist chunk progress in `.bug-hunter/state.json` so restarts do not re-scan completed chunks.
- Test files (CONTEXT-ONLY) are included only when needed for intent.

If the triage output shows `needsLoop: true` and `LOOP_MODE=false` (the user passed `--no-loop`), warn the user: "This codebase has [N] source files (FILE_BUDGET: [B]). Single-pass mode will only cover a subset. Loop mode is recommended for thorough coverage (remove `--no-loop` to enable). Large codebases use domain-scoped auditing — see `modes/large-codebase.md`."
## Execution Steps

### Step 0: Preflight checks

Before doing anything else, verify the environment:

1. **Resolve skill directory.** Determine `SKILL_DIR` dynamically.
   - Preferred: derive it from the absolute path of the current `SKILL.md` (`dirname` of this file).
   - Fallback probe order: `$HOME/.agents/skills/bug-hunter`, `$HOME/.claude/skills/bug-hunter`, `$HOME/.codex/skills/bug-hunter`.
   - Use this path for ALL Read tool calls and shell commands.
2. **Verify skill files exist.** Run `ls "$SKILL_DIR/prompts/hunter.md"` via Bash. If this fails, stop and tell the user: "Bug Hunter skill files not found. Reinstall the skill and retry."
3. **Node.js available.** Run `node --version` via Bash. If it fails, stop and tell the user: "Node.js is required for doc verification. Please install Node.js to continue."
3b. **Create output directory:**

   ```bash
   mkdir -p .bug-hunter/payloads .bug-hunter/domains
   ```

   This directory stores all pipeline artifacts. Add `.bug-hunter/` to your project's `.gitignore`.
4. **Doc lookup availability (optional, non-blocking).** Run a quick smoke test:

   ```bash
   node "$SKILL_DIR/scripts/doc-lookup.cjs" search "express" "middleware"
   ```

   If it returns results, set `DOC_LOOKUP_AVAILABLE=true`. If it fails, try the fallback:

   ```bash
   node "$SKILL_DIR/scripts/context7-api.cjs" search "express" "middleware"
   ```

   If both fail, warn the user and set `DOC_LOOKUP_AVAILABLE=false`. A missing `CONTEXT7_API_KEY` must NOT block execution; anonymous lookups may still work.
5. **Verify helper scripts exist:**

   ```bash
   ls "$SKILL_DIR/scripts/run-bug-hunter.cjs" "$SKILL_DIR/scripts/bug-hunter-state.cjs" \
      "$SKILL_DIR/scripts/delta-mode.cjs" "$SKILL_DIR/scripts/payload-guard.cjs" \
      "$SKILL_DIR/scripts/fix-lock.cjs" "$SKILL_DIR/scripts/triage.cjs" "$SKILL_DIR/scripts/doc-lookup.cjs"
   ```

   If any are missing, stop and tell the user to update/reinstall the skill.
   - Note: `code-index.cjs` is optional — it enables cross-domain dependency analysis for boundary audits in large-codebase mode, but the pipeline works fully without it.
   - Note: `context7-api.cjs` is kept as a fallback — `doc-lookup.cjs` is the primary doc verification script.
   - Note: `worktree-harvest.cjs` is optional — it enables worktree-isolated Fixer dispatch for the `subagent`/`teams` backends. Without it, Fixers edit directly on the fix branch (still safe via the single-writer lock + auto-revert).
5b. **Check Context Hub CLI (recommended, non-blocking):**

   ```bash
   chub --help 2>/dev/null && chub update 2>/dev/null
   ```

   If `chub` is available, set `CHUB_AVAILABLE=true` and report:

   ```
   ✓ Context Hub available — using curated docs for verification.
   ```

   If `chub` is NOT installed, set `CHUB_AVAILABLE=false` and warn the user visibly:

   ```
   ⚠️ Context Hub (chub) is not installed. Doc verification will fall back to the Context7 API,
   which has broader coverage but less curated results.
   For better doc verification accuracy, install Context Hub:
     npm install -g @aisuite/chub
   More info: https://github.com/andrewyng/context-hub
   ```

   Do NOT block the pipeline — the Context7 fallback works, just with less curated results.
6. **Select orchestration backend (cross-CLI portability).** Detect which dispatch tools are available in your runtime. Use the FIRST that works:

   **Option A — `subagent` tool (Pi agent, preferred for parallel):**
   - Test: call `subagent({ action: "list" })`. If it returns without error, this backend works.
   - Set `AGENT_BACKEND = "subagent"`.
   - Dispatch pattern for each phase (`<phase>` and `<task>` are placeholders):

     ```
     subagent({
       agent: "<phase>-agent",
       task: "<task>",
       output: ".bug-hunter/<phase>-output.md"
     })
     ```

   - Read the output file after the subagent completes.

   **Option B — `teams` tool (Pi agent teams):**
   - Test: does the `teams` tool exist in your available tools?
   - Set `AGENT_BACKEND = "teams"`.
   - Dispatch pattern:

     ```
     teams({
       tasks: [{ text: "<task>" }],
       maxTeammates: 1
     })
     ```

   **Option C — `interactive_shell` (Claude Code, Codex, other CLI agents):**
   - Set `AGENT_BACKEND = "interactive_shell"`.
   - Dispatch pattern:

     ```
     interactive_shell({
       command: 'pi "<task>"',
       mode: "dispatch"
     })
     ```

   **Option D — `local-sequential` (default — always works):**
   - Set `AGENT_BACKEND = "local-sequential"`.
   - Read `SKILL_DIR/modes/local-sequential.md` for full instructions.
   - You run all phases (Recon, Hunter, Skeptic, Referee) yourself, sequentially, within your own context window. Write phase outputs to `.bug-hunter/` files between phases.
   - IMPORTANT: `local-sequential` is NOT a degraded mode. It is the expected default for most environments, and the skill works fully in this mode. Subagent dispatch is an optimization for large codebases, not a requirement.

   Rules:
   - Use exactly ONE backend for the whole run.
   - If a remote backend launch fails, fall back to the next option.
   - If all remote backends fail, use `local-sequential` and continue.
### Step 1: Parse arguments, resolve target, and run triage

Follow the rules in the Target section above. If in branch-diff or staged mode, run the appropriate git command now, collect the file list, and apply the filter.

Report to the user:

- Mode (full project / directory / file / branch diff / staged)
- Number of source files to scan (after filtering)
- Number of files filtered out

Then run triage (zero-token strategy decision). Run the triage script AFTER resolving the target. This is a pure Node.js filesystem scan — no tokens consumed; it runs in under 2 seconds even on 2,000+ file repos.

```bash
node "$SKILL_DIR/scripts/triage.cjs" scan "<target>" --output .bug-hunter/triage.json
```

Then read `.bug-hunter/triage.json`. It contains:
- `strategy`: which mode to use ("single-file", "small", "parallel", "extended", "scaled", "large-codebase")
- `modeFile`: which mode file to read
- `fileBudget`: computed from actual file sizes (sampled), not a guess
- `totalFiles` / `scannableFiles`: exact counts
- `domains`: directory-level risk classification (CRITICAL/HIGH/MEDIUM/LOW/CONTEXT-ONLY)
- `riskMap`: file-level classification (only present when ≤200 files)
- `domainFileLists`: per-domain file lists (only present for the large-codebase strategy)
- `scanOrder`: priority-ordered list for Hunters
- `tokenEstimate`: cost estimates for each pipeline phase
- `needsLoop`: whether loop mode is needed for full coverage (loop is on by default; this indicates `--no-loop` would cause incomplete coverage)
Set these variables from the triage output:

```
STRATEGY        = triage.strategy
FILE_BUDGET     = triage.fileBudget
TOTAL_FILES     = triage.totalFiles
SCANNABLE_FILES = triage.scannableFiles
NEEDS_LOOP      = triage.needsLoop
```

Report to the user:

```
Triage: [TOTAL_FILES] source files | FILE_BUDGET: [FILE_BUDGET] | Strategy: [STRATEGY]
Domains: [N] CRITICAL, [N] HIGH, [N] MEDIUM, [N] LOW
Token estimate: ~[N] tokens for full pipeline
```

If triage says `needsLoop: true` and `LOOP_MODE=false` (the user passed `--no-loop`), warn:

```
⚠️ This codebase has [N] source files (FILE_BUDGET: [B]).
Single-pass mode will only cover a subset. Remove `--no-loop` to enable iterative coverage.
Proceeding with partial scan — highest-priority queued files only.
```

Triage replaces Recon's FILE_BUDGET computation. Recon still runs for tech-stack identification and pattern-based analysis, but it no longer needs to count files or compute the context budget — triage already did that, for free.
### Step 1b: Generate threat model (if --threat-model)

If `THREAT_MODEL_MODE=true`:

1. Check whether `.bug-hunter/threat-model.md` already exists.
   - If it exists and was modified within the last 90 days: use it as-is. Set `THREAT_MODEL_AVAILABLE=true`.
   - If it exists but is >90 days old: warn the user ("Threat model is N days old — regenerating") and regenerate.
   - If it doesn't exist: generate it.
2. To generate:
   - Read `$SKILL_DIR/prompts/threat-model.md`.
   - Dispatch the threat-model generation agent (or execute locally if local-sequential).
   - Input: `triage.json` (if available) for file structure, or Glob-based discovery.
   - Wait for `.bug-hunter/threat-model.md` to be written.
   - Set `THREAT_MODEL_AVAILABLE=true`.

If `THREAT_MODEL_MODE=false` but `.bug-hunter/threat-model.md` exists: load it anyway — free context. Set `THREAT_MODEL_AVAILABLE=true` and report: "Existing threat model found — loading for enhanced security analysis."
### Step 1c: Dependency scan (if --deps)

If `DEP_SCAN=true`:

```bash
node "$SKILL_DIR/scripts/dep-scan.cjs" --target "<target>" --output .bug-hunter/dep-findings.json
```

Report to the user:

```
Dependencies: [N] HIGH/CRITICAL CVEs found | [R] reachable, [P] potentially reachable, [U] not reachable
```

If `.bug-hunter/dep-findings.json` exists with REACHABLE findings, include them in the Hunter context as "Known Vulnerable Dependencies" — the Hunter should verify whether the vulnerable APIs are called in the scanned source files.
### Step 2: Read prompt files on demand (context efficiency)

**MANDATORY:** You MUST read prompt files using the Read tool before passing them to subagents or executing them yourself. Do NOT skip this or act from memory. Use the absolute SKILL_DIR path resolved in Step 0. Load only what you need for each phase — do NOT read all files upfront:

| Phase | Read These Files |
| --- | --- |
| Threat Model (Step 1b) | `prompts/threat-model.md` (only if THREAT_MODEL_MODE=true) |
| Recon (Step 4) | `prompts/recon.md` (skip for single-file mode) |
| Hunters (Step 5) | `prompts/hunter.md` + `prompts/doc-lookup.md` + `prompts/examples/hunter-examples.md` |
| Skeptics (Step 6) | `prompts/skeptic.md` + `prompts/doc-lookup.md` + `prompts/examples/skeptic-examples.md` |
| Referee (Step 7) | `prompts/referee.md` |
| Fixers (Phase 2) | `prompts/fixer.md` + `prompts/doc-lookup.md` (only if FIX_MODE=true) |

Concrete examples for each backend:

**Example A: local-sequential (most common)**

```
Phase B — launching Hunter yourself
1. Read the prompt file:
   read({ path: "$SKILL_DIR/prompts/hunter.md" })
2. You now have the Hunter's full instructions. Execute them yourself:
   - Read each file in risk-map order using the Read tool
   - Apply the security checklist sweep
   - Write each finding in BUG-N format
3. Write your canonical findings artifact to disk:
   write({ path: ".bug-hunter/findings.json", content: "<findings JSON>" })
```

**Example B: subagent backend**

```
Phase B — launching Hunter via subagent
1. Read the prompt:
   read({ path: "$SKILL_DIR/prompts/hunter.md" })
2. Read the wrapper template:
   read({ path: "$SKILL_DIR/templates/subagent-wrapper.md" })
3. Fill the template with:
   - {ROLE_NAME} = "hunter"
   - {ROLE_DESCRIPTION} = "Bug Hunter — find behavioral bugs in source code"
   - {PROMPT_CONTENT} = <the hunter.md content>
   - {TARGET_DESCRIPTION} = "FindCoffee monorepo backend services"
   - {FILE_LIST} = <assigned files>
   - {RISK_MAP} = <risk map from triage/Recon>
   - {TECH_STACK} = <from Recon>
   - {PHASE_SPECIFIC_CONTEXT} = <phase context>
   - {OUTPUT_FILE_PATH} = ".bug-hunter/findings.json"
   - {SKILL_DIR} = <absolute skill dir>
4. Dispatch:
   subagent({ agent: "hunter-agent", task: "<filled template>" })
5. Read the output:
   read({ path: ".bug-hunter/findings.json" })
```
- When launching subagents, always pass `SKILL_DIR` explicitly in the task context so prompt commands like `node "$SKILL_DIR/scripts/doc-lookup.cjs"` resolve correctly. The `context7-api.cjs` script is kept as a fallback if `doc-lookup.cjs` fails.
- Before every subagent launch, validate the payload shape with:

  ```bash
  node "$SKILL_DIR/scripts/payload-guard.cjs" validate "<payload>" "<phase>"
  ```

  If validation fails, do NOT launch the subagent. Fix the payload first.
- Any mode step that says "launch subagent" means "dispatch an agent task using `AGENT_BACKEND`". For `local-sequential`, "launch" means "execute that phase's instructions yourself."
- After reading each prompt, extract the key instructions and pass the content to subagents via their system prompts. You do not need to keep the full text in working memory.
- Context pruning for subagents: when passing bug lists to Skeptics, Fixers, or the Referee, only include the bugs assigned to that agent — not the full merged list. For each bug, include: BUG-ID, severity, file, lines, claim, evidence, runtime trigger, cross-references. Omit the Hunter's internal reasoning, scan-coverage stats, and any "FILES SCANNED/SKIPPED" metadata. This keeps subagent prompts lean.
### Step 3: Determine execution mode

Use the triage output from Step 1 — the strategy and FILE_BUDGET are already computed. Do NOT wait for Recon to determine the mode.

Read the corresponding mode file using `STRATEGY` from the triage JSON:

- `single-file`: `SKILL_DIR/modes/single-file.md`
- `small`: `SKILL_DIR/modes/small.md`
- `parallel`: `SKILL_DIR/modes/parallel.md`
- `extended`: `SKILL_DIR/modes/extended.md`
- `scaled`: `SKILL_DIR/modes/scaled.md`
- `large-codebase`: force `LOOP_MODE=true` and read `SKILL_DIR/modes/large-codebase.md`, then `SKILL_DIR/modes/loop.md`

Backend override for local-sequential: if `AGENT_BACKEND = "local-sequential"`, read `SKILL_DIR/modes/local-sequential.md` instead of the size-based mode file. The local-sequential mode handles all sizes internally with its own chunking logic.

If `LOOP_MODE=true`, also read:

- `SKILL_DIR/modes/fix-loop.md` when `FIX_MODE=true`
- `SKILL_DIR/modes/loop.md` otherwise

CRITICAL — ralph-loop integration: when `LOOP_MODE=true`, you MUST call the `ralph_start` tool before running the first pipeline iteration. The loop mode files (`loop.md` / `fix-loop.md`) contain the exact `ralph_start` call to make, including the `taskContent` and `maxIterations` parameters. Without calling `ralph_start`, the loop will NOT iterate — it will run once and stop. After each iteration, call `ralph_done` to continue, or output `COMPLETE` when done.

Report the chosen mode to the user. Then follow the steps in the loaded mode file. Each mode file contains the specific steps for running Recon, Hunters, Skeptics, and the Referee for that mode. Each mode also references `modes/_dispatch.md` for backend-specific dispatch patterns. Execute them in order.

Branch-diff and staged optimization: for `-b` and `--staged` modes, if the file count ≤ FILE_BUDGET, always use `small` or `parallel` mode regardless of total codebase size. The triage script already handles this, since it only scans the provided target files.

For `extended` and `scaled` modes, initialize state before chunk execution (`<target>` and `<mode>` are placeholders):

```bash
node "$SKILL_DIR/scripts/bug-hunter-state.cjs" init ".bug-hunter/state.json" "<target>" "<mode>" 30
```

Then apply hash-based skip filtering before each chunk:

```bash
node "$SKILL_DIR/scripts/bug-hunter-state.cjs" hash-filter ".bug-hunter/state.json" "<files JSON>"
```

For full autonomous chunk orchestration with timeouts, retries, and journaling, extended/scaled modes can use:

```bash
node "$SKILL_DIR/scripts/run-bug-hunter.cjs" run --skill-dir "$SKILL_DIR" --files-json "<files JSON>" --mode "<mode>"
```

See `run-bug-hunter.cjs --help` for all options (delta-mode, canary-size, expand-on-low-confidence, etc.).
### Step 7: Present the Final Report

After the mode-specific steps complete, display the final report:

1. **Scan metadata**
   - Mode (single-file / small / parallel-hybrid / extended / scaled / loop)
   - Files scanned: N source files (N filtered out)
   - Architecture: [summary from Recon]
   - Tech stack: [framework, auth, DB from Recon]
2. **Pipeline summary**
   - Triage: [N] source files | FILE_BUDGET: [B] | Strategy: [STRATEGY]
   - Recon: mapped N files -> CRITICAL: X | HIGH: Y | MEDIUM: Z | Tests: T
   - Hunters: [deep scan findings: W | optional triage findings: T | merged: U unique]
   - Gap-fill: [N files re-scanned, M additional findings] (or "not needed")
   - Skeptics: [challenged X | disproved: D, accepted: A]
   - Referee: confirmed N real bugs -> Critical: X | Medium: Y | Low: Z
3. **Confirmed bugs table** (sorted by severity — from the Referee output)
4. **Low-confidence items** flagged for manual review. Include an auto-fix eligibility field per bug:
   - ELIGIBLE: Referee confidence >= 75%
   - MANUAL_REVIEW: confidence < 75% or missing confidence

   If low-confidence items exist, expand the scan scope from delta mode using trust-boundary overlays before finalizing the report.
5. **Dismissed findings** in a collapsed section (for transparency).
6. **Agent accuracy stats**
   - Deep Hunter accuracy: X/Y confirmed (Z%)
   - Optional triage value: N triage-only findings promoted to deep scan
   - Skeptic accuracy: X/Y correct challenges (Z%)
7. **Coverage assessment**
   - If ALL queued scannable source files were scanned: "Full queued coverage achieved."
   - If any were missed: list them with a note about loop mode.

**7b. Coverage enforcement (mandatory)**

If the coverage assessment shows ANY queued scannable source files were not scanned, the pipeline is NOT complete:

- If `LOOP_MODE=true` (default): the ralph-loop will automatically continue to the next iteration covering the missed files. Call `ralph_done` to proceed to the next iteration. Do NOT output `COMPLETE` until all queued scannable source files show DONE.
- If `LOOP_MODE=false` (`--no-loop` was specified) AND missed files exist:
  - If total files ≤ FILE_BUDGET × 3, output the report with a WARNING:

    ```
    ⚠️ PARTIAL COVERAGE: [N] queued source files were not scanned.
    Run `/bug-hunter [path]` for complete coverage (loop is on by default).
    Unscanned files: [list them]
    ```

  - If total files > FILE_BUDGET × 3, the report MUST include:

    ```
    🚨 LARGE CODEBASE: [N] source files (FILE_BUDGET: [B]).
    Single-pass audit covered [X]% of queued source files.
    Use `/bug-hunter [path]` for full coverage (loop is on by default).
    ```

Do NOT claim "audit complete" or "full coverage achieved" unless ALL queued scannable source files have status DONE. A partial audit is still valuable — report what you found honestly.

Autonomous runs must keep descending through the remaining priority queue after the current prioritized chunk is done:

1. Finish current CRITICAL/HIGH work first.
2. Immediately continue with remaining MEDIUM files.
3. Then continue with remaining LOW files.
4. Only stop when the queue is exhausted, the user interrupts, or a hard blocker prevents safe progress.

If zero bugs were confirmed, say so clearly — a clean report is a good result.

Routing after report:

- If confirmed bugs > 0 AND `FIX_MODE=true`:
  - Auto-fix only ELIGIBLE bugs.
  - Apply canary-first rollout: fix the top critical eligible subset first, verify, then continue with the remaining eligible fixes.
  - Keep MANUAL_REVIEW bugs in the report only (do not auto-edit).
  - Run a final global consistency pass over the merged findings before applying fixes.
  - Read `SKILL_DIR/modes/fix-pipeline.md` and execute Phase 2 on the eligible subset.
- If confirmed bugs > 0 AND `FIX_MODE=false`: stop after the report (scan-only mode).
- If zero bugs confirmed: stop here. The report is the final output.
**8. JSON output (always generated)**

After the markdown report, write a machine-readable findings file to `.bug-hunter/findings.json`:

```json
{
  "version": "3.0.0",
  "scan_id": "scan-YYYY-MM-DD-HHmmss",
  "scan_date": "",
  "mode": "",
  "target": "",
  "files_scanned": 0,
  "threat_model_loaded": false,
  "confirmed": [
    {
      "id": "BUG-1",
      "severity": "CRITICAL",
      "category": "security",
      "stride": "Tampering",
      "cwe": "CWE-89",
      "file": "src/api/users.ts",
      "lines": "45-49",
      "claim": "SQL injection via unsanitized query parameter",
      "reachability": "EXTERNAL",
      "exploitability": "EASY",
      "cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N",
      "cvss_score": 9.1,
      "poc": {
        "payload": "...",
        "request": "...",
        "expected": "...",
        "actual": "..."
      }
    }
  ],
  "dismissed": [
    {
      "id": "BUG-3",
      "severity": "Medium",
      "category": "logic",
      "file": "...",
      "claim": "...",
      "reason": "..."
    }
  ],
  "dependencies": [],
  "summary": {
    "total_reported": 0,
    "confirmed": 0,
    "dismissed": 0,
    "by_severity": { "CRITICAL": 0, "HIGH": 0, "MEDIUM": 0, "LOW": 0 },
    "by_stride": {
      "Tampering": 0,
      "InfoDisclosure": 0,
      "ElevationOfPrivilege": 0,
      "Spoofing": 0,
      "DoS": 0,
      "Repudiation": 0,
      "N/A": 0
    },
    "by_category": { "security": 0, "logic": 0, "error-handling": 0 }
  }
}
```
Rules for JSON output:

- Non-security findings: `stride: "N/A"`, `cwe: "N/A"`; omit the reachability/CVSS/PoC fields.
- Security findings below CRITICAL/HIGH severity: omit the CVSS and PoC fields.
- The `dependencies` array is populated only if `--deps` was used and `.bug-hunter/dep-findings.json` exists.

This JSON enables CI/CD gating, dashboard ingestion, and downstream patch generation.

Also write the final markdown report to `.bug-hunter/report.md` as the canonical human-readable output. Generate it from the JSON artifacts with:

```bash
node "$SKILL_DIR/scripts/render-report.cjs" report ".bug-hunter/findings.json" ".bug-hunter/referee.json" > ".bug-hunter/report.md"
```
## Self-Test Mode

To validate that the pipeline works end-to-end, run `/bug-hunter SKILL_DIR/test-fixture/` on the included test fixture. This directory contains a small Express app with 6 intentionally planted bugs (2 Critical, 3 Medium, 1 Low). Expected results:

- Recon should classify 3 files as CRITICAL, 1 as HIGH
- Hunters should find all 6 bugs (possibly plus some false positives)
- The Skeptic should challenge at least 1 false positive
- The Referee should confirm all 6 planted bugs

If the pipeline finds fewer than 5 of the 6 planted bugs, the prompts need tuning. If it reports more than 3 false positives that survive to the Referee, the Skeptic prompt needs tightening.

The test fixture source files ship with the skill. If using `--fix` mode on the fixture, initialize its git repo first:

```bash
bash SKILL_DIR/scripts/init-test-fixture.sh
```
## Error handling

| Step | Failure | Fallback |
| --- | --- | --- |
| Triage | script error | Skip triage; Recon does full classification with the FILE_BUDGET=40 default |
| Recon | timeout/error | Skip Recon; Hunters use the triage scanOrder (or Glob-based discovery if no triage) |
| Optional scout pass | timeout/error | Disable the scout, continue with the deep Hunter |
| Deep Hunter | timeout/error | Retry once on a narrowed chunk, otherwise report partial coverage |
| Orchestration backend | launch failure | Fall back to the next backend (subagent → teams → interactive_shell → local-sequential) |
| Gap-fill Hunter | timeout/error | Note missed files, continue |
| Payload guard | validation fails | Do not launch the subagent; fix the payload and retry |
| Chunk orchestrator | timeout/error | Retry with exponential backoff, then mark the chunk failed |
| Skeptic | timeout/error | Use a single Skeptic, or accept all findings as-is |
| Referee | timeout/error | Use the Skeptic's accepted list as the final result |
| Git safety (Step 8a) | not a git repo | Warn the user, skip branching |
| Git safety (Step 8a) | stash/branch fails | Warn, continue without the safety net |
| Fix lock | lock held | Stop Phase 2, report a concurrent fixer run |
| Test baseline (Step 8c) | timeout/not found | Set BASELINE=null, skip test verification |
| Fixer | timeout/error | Mark unfixed bugs as SKIPPED |
| Post-fix tests | new failures | Auto-revert the failed fix commit, mark FIX_REVERTED |
| Post-fix re-scan | timeout/error | Skip the re-scan, note "fixer output not re-verified" |
| Worktree prepare | `git worktree add` fails | Fall back to `WORKTREE_MODE=false` (direct edit mode) for this run |
| Worktree harvest | no commits found, dirty | Stash uncommitted work, mark bugs as FIX_FAILED (reason: fixer-did-not-commit) |
| Worktree harvest | branch switched | Mark all bugs in the batch as FIX_FAILED (reason: branch-switched) |
| Worktree cleanup | `git worktree remove` fails | Force-remove the directory, run `git worktree prune` |
| Stale worktrees | from a previous crash | `cleanup-all` at Step 8a-wt removes them before starting |
| Fix lock release | release fails | Warn the user to clear `.bug-hunter/fix.lock` manually |