bug-hunter

Installs: 34
Rank: #19932

Installation

npx skills add https://github.com/codexstar69/bug-hunter --skill bug-hunter
Bug Hunt - Adversarial Bug Finding
Run a sequential-first adversarial bug hunt on your codebase. Use parallelism only for read-only triage and independent verification tasks.
Table of Contents
Usage
Target
Context Budget
Execution Steps
Step 7: Present the Final Report
Self-Test Mode
Error handling
Phase 1 — Find & Verify:
Recon (map) --> Hunter (deep scan) --> Skeptic (challenge) --> Referee (final verdict)
^ (optional read-only dual-lens triage can run here)
|
state + chunk checkpoints
Phase 2 — Fix & Verify (default when bugs are confirmed):
Baseline --> Git branch --> sequential Fixer (single writer) --> targeted verify --> full verify --> report
^ |
+------------------------ checkpoint commits + auto-revert -----+
For small scans (1-10 source files): runs single Hunter + single Skeptic (no parallelism overhead).
For large scans: process chunks sequentially with persistent state to avoid compaction drift.
Usage
/bug-hunter # Scan entire project
/bug-hunter src/ # Scan specific directory
/bug-hunter lib/auth.ts # Scan specific file
/bug-hunter -b feature-xyz # Scan files changed in feature-xyz vs main
/bug-hunter -b feature-xyz --base dev # Scan files changed in feature-xyz vs dev
/bug-hunter --staged # Scan staged files (pre-commit check)
/bug-hunter --scan-only src/ # Scan only, no code changes
/bug-hunter --fix src/ # Find bugs AND auto-fix them
/bug-hunter --autonomous src/ # Alias for no-intervention auto-fix run
/bug-hunter --fix -b feature-xyz # Find + fix on branch diff
/bug-hunter --fix --approve src/ # Find + fix, but ask before each fix
/bug-hunter src/ # Loops by default: audit + fix until all queued source files are covered
/bug-hunter --no-loop src/ # Single-pass only, no iterating
/bug-hunter --no-loop --scan-only src/ # Single-pass scan, no fixes, no loop
/bug-hunter --deps src/ # Include dependency CVE scan
/bug-hunter --threat-model src/ # Generate/use STRIDE threat model
/bug-hunter --deps --threat-model src/ # Full security audit
/bug-hunter --fix --dry-run src/ # Preview fixes without editing files
Target
The raw arguments are: $ARGUMENTS
Parse the arguments as follows:
0. Default `LOOP_MODE=true`. If arguments contain `--no-loop`, strip it from the arguments and set `LOOP_MODE=false`. The `--loop` flag is accepted for backwards compatibility but is a no-op (loop is already the default).
0b. Default `FIX_MODE=true`.
0c. If arguments contain `--scan-only`, strip it from the arguments and set `FIX_MODE=false`.
0d. If arguments contain `--fix`, strip it from the arguments and set `FIX_MODE=true`. The remaining arguments are parsed normally below.
0e. If arguments contain `--autonomous`, strip it from the arguments, set `AUTONOMOUS_MODE=true`, and force `FIX_MODE=true` (canary-first + confidence-gated).
0f. If arguments contain `--approve`, strip it from the arguments and set `APPROVE_MODE=true`. When this flag is set, Fixer agents run in `mode: "default"` (user reviews and approves each edit). When not set, `APPROVE_MODE=false` and Fixers run autonomously.
0g. If arguments contain `--deps`, strip it and set `DEP_SCAN=true`. Dependency scanning runs package manager audit tools and checks if vulnerable APIs are actually called in the codebase.
0h. If arguments contain `--threat-model`, strip it and set `THREAT_MODEL_MODE=true`. Generates a STRIDE threat model at `.bug-hunter/threat-model.md` if one doesn't exist, then feeds it to Recon + Hunter for targeted security analysis.
0i. If arguments contain `--dry-run`, strip it and set `DRY_RUN_MODE=true`. Forces `FIX_MODE=true`. In dry-run mode, Phase 2 builds the fix plan and the Fixer reads code and outputs planned changes as unified diff previews, but no file edits, git commits, or lock acquisition occur. Produces `fix-report.json` with `"dry_run": true`.
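The flag rules above can be sketched as a small parser. This is only an illustration, not the skill's actual implementation: `parseFlags` is a hypothetical helper, and splitting on whitespace is an assumption that would mishandle paths containing spaces. Flags are applied in the documented step order (0 through 0i), so a later step such as `--dry-run` wins over an earlier `--scan-only`.

```javascript
// Hypothetical helper: strips the documented flags and returns mode variables
// plus whatever arguments remain (the target, -b, --base, --staged).
function parseFlags(rawArgs) {
  let tokens = rawArgs.trim().split(/\s+/).filter(Boolean);

  // Remove every occurrence of `flag` and report whether it was present.
  const take = (flag) => {
    const found = tokens.includes(flag);
    tokens = tokens.filter((t) => t !== flag);
    return found;
  };

  const modes = {
    LOOP_MODE: true,
    FIX_MODE: true,
    AUTONOMOUS_MODE: false,
    APPROVE_MODE: false,
    DEP_SCAN: false,
    THREAT_MODEL_MODE: false,
    DRY_RUN_MODE: false,
  };

  if (take("--no-loop")) modes.LOOP_MODE = false;
  take("--loop"); // accepted for backwards compatibility, no effect
  if (take("--scan-only")) modes.FIX_MODE = false;
  if (take("--fix")) modes.FIX_MODE = true;
  if (take("--autonomous")) {
    modes.AUTONOMOUS_MODE = true;
    modes.FIX_MODE = true;
  }
  if (take("--approve")) modes.APPROVE_MODE = true;
  if (take("--deps")) modes.DEP_SCAN = true;
  if (take("--threat-model")) modes.THREAT_MODEL_MODE = true;
  if (take("--dry-run")) {
    modes.DRY_RUN_MODE = true;
    modes.FIX_MODE = true;
  }

  return { modes, remaining: tokens };
}

console.log(parseFlags("--no-loop --scan-only src/"));
```

The `remaining` array is what the numbered target-resolution rules that follow would operate on.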
1. If arguments contain `--staged`, this is staged file mode.
   - Run `git diff --cached --name-only` using the Bash tool to get the list of staged files.
   - If the command fails, report the error to the user and stop.
   - If no files are staged, tell the user there are no staged changes to scan and stop.
   - The scan target is the list of staged files (scan their full contents, not just the diff).
2. If arguments contain `-b`, this is branch diff mode.
   - Extract the branch name after `-b`.
   - If `--base` is also present, use that as the base branch. Otherwise default to `main`.
   - Run `git diff --name-only ...` using the Bash tool to get the list of changed files.
   - If the command fails (e.g. branch not found), report the error to the user and stop.
   - If no files changed, tell the user there are no changes to scan and stop.
   - The scan target is the list of changed files (scan their full contents, not just the diff).
3. If arguments do NOT contain `-b` or `--staged`, treat the entire argument string as a path target (file or directory). If empty, scan the current working directory.
After resolving the file list (for modes 1 and 2), filter out non-source files. Remove any files matching these patterns, since they are not scannable source code:
- Docs/text: `*.md`, `*.txt`, `*.rst`, `*.adoc`
- Config: `*.json`, `*.yaml`, `*.yml`, `*.toml`, `*.ini`, `*.cfg`, `.env*`, `.gitignore`, `.editorconfig`, `.prettierrc*`, `.eslintrc*`, `tsconfig.json`, `jest.config.*`, `vitest.config.*`, `webpack.config.*`, `vite.config.*`, `next.config.*`, `tailwind.config.*`
- Lockfiles: `*.lock`, `*.sum`
- Minified/maps: `*.min.js`, `*.min.css`, `*.map`
- Assets: `*.svg`, `*.png`, `*.jpg`, `*.gif`, `*.ico`, `*.woff*`, `*.ttf`, `*.eot`
- Project meta: `LICENSE`, `CHANGELOG*`, `CONTRIBUTING*`, `CODE_OF_CONDUCT*`, `Makefile`, `Dockerfile`, `docker-compose*`, `Procfile`
- Vendor dirs: `node_modules/`, `vendor/`, `dist/`, `build/`, `.next/`, `__pycache__/`, `.venv/`
If after filtering there are zero source files left, tell the user: "No scannable source files found — only config/docs/assets were changed." and stop.
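A minimal sketch of the filtering step, assuming the file list is an array of relative paths. The pattern values come from the list above; the matching logic (`globToRegExp`, `isScannable`, `filterSourceFiles`) is hypothetical and only supports `*` as a wildcard on the file's base name.

```javascript
// Directory names that exclude everything beneath them.
const SKIP_DIRS = ["node_modules", "vendor", "dist", "build", ".next", "__pycache__", ".venv"];

// Base-name patterns taken from the filter list; "*" is the only wildcard.
const SKIP_GLOBS = [
  "*.md", "*.txt", "*.rst", "*.adoc",
  "*.json", "*.yaml", "*.yml", "*.toml", "*.ini", "*.cfg", ".env*",
  ".gitignore", ".editorconfig", ".prettierrc*", ".eslintrc*",
  "jest.config.*", "vitest.config.*", "webpack.config.*", "vite.config.*",
  "next.config.*", "tailwind.config.*",
  "*.lock", "*.sum", "*.min.js", "*.min.css", "*.map",
  "*.svg", "*.png", "*.jpg", "*.gif", "*.ico", "*.woff*", "*.ttf", "*.eot",
  "LICENSE", "CHANGELOG*", "CONTRIBUTING*", "CODE_OF_CONDUCT*",
  "Makefile", "Dockerfile", "docker-compose*", "Procfile",
];

// Convert a simple glob into an anchored regular expression.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

const SKIP_REGEXES = SKIP_GLOBS.map(globToRegExp);

// True when the path is outside vendor dirs and its base name matches no pattern.
function isScannable(filePath) {
  const parts = filePath.split(/[\\/]/);
  const base = parts[parts.length - 1];
  if (parts.slice(0, -1).some((dir) => SKIP_DIRS.includes(dir))) return false;
  return !SKIP_REGEXES.some((re) => re.test(base));
}

// Returns the kept files plus the filtered-out count reported in Step 1.
function filterSourceFiles(files) {
  const kept = files.filter(isScannable);
  return { kept, filteredOut: files.length - kept.length };
}

console.log(filterSourceFiles(["src/a.ts", "README.md", "node_modules/x/index.js"]));
```

An empty `kept` array corresponds to the "No scannable source files found" stop condition.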
Context Budget
FILE_BUDGET is computed by the triage script (Step 1), not by Recon.
The triage script samples 30 files from the codebase, computes average line count, and derives:
avg_tokens_per_file = average_lines_per_file * 4
FILE_BUDGET = floor(150000 / avg_tokens_per_file) # capped at 60, floored at 10
Triage also determines the strategy directly, so Step 3 just reads the triage output — no circular dependency.
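The budget formula can be sketched as follows. The constants (4 tokens per line, a 150000-token budget, cap 60, floor 10) come from the text; `computeFileBudget` is a hypothetical name, and returning 40 for an empty sample is an assumption borrowed from the documented default, not something triage is known to do.

```javascript
const TOKEN_BUDGET = 150000;
const TOKENS_PER_LINE = 4;

// sampleLineCounts: line counts of the sampled files (the text samples 30).
function computeFileBudget(sampleLineCounts) {
  if (sampleLineCounts.length === 0) return 40; // assumption: fall back to the default

  const totalLines = sampleLineCounts.reduce((sum, n) => sum + n, 0);
  const avgTokensPerFile = (totalLines / sampleLineCounts.length) * TOKENS_PER_LINE;

  // Division by zero yields Infinity, which the cap below reduces to 60.
  const raw = Math.floor(TOKEN_BUDGET / avgTokensPerFile);
  return Math.min(60, Math.max(10, raw));
}

console.log(computeFileBudget([900, 1100])); // average 1000 lines per file
```

With an average of 1000 lines per file, the estimate is 4000 tokens per file and the budget is floor(150000 / 4000) = 37.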
Then determine partitioning:
| Total source files | Strategy | Hunters | Skeptics |
|---|---|---|---|
| 1 | Single-file mode | 1 general | 1 |
| 2-10 | Small mode | 1 general | 1 |
| 11 to FILE_BUDGET | Parallel mode (hybrid) | 1 deep Hunter (+ optional 2 read-only triage Hunters) | 1-2 by directory |
| FILE_BUDGET+1 to FILE_BUDGET*2 | Extended mode | Sequential chunked Hunters | 1-2 by directory |
| FILE_BUDGET*2+1 to FILE_BUDGET*3 | Scaled mode | Sequential chunked Hunters with resume state | 1-2 by directory |
| > FILE_BUDGET*3 | Large-codebase mode + Loop | Domain-scoped pipelines + boundary audits | Per-domain 1-2 |
If triage was not run (e.g., Recon was called directly without the orchestrator), use the default FILE_BUDGET of 40.
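The partitioning thresholds reduce to a simple cascade. A sketch, assuming the strategy identifiers that triage reports ("single-file", "small", "parallel", "extended", "scaled", "large-codebase"); `pickStrategy` is a hypothetical name and the real decision is made by the triage script.

```javascript
// totalFiles: scannable source files after filtering.
// fileBudget: FILE_BUDGET from triage, or the default of 40 when triage was not run.
function pickStrategy(totalFiles, fileBudget = 40) {
  if (totalFiles <= 1) return "single-file";
  if (totalFiles <= 10) return "small";
  if (totalFiles <= fileBudget) return "parallel";
  if (totalFiles <= fileBudget * 2) return "extended";
  if (totalFiles <= fileBudget * 3) return "scaled";
  return "large-codebase";
}

for (const n of [1, 10, 40, 80, 120, 121]) {
  console.log(n, pickStrategy(n));
}
```

Note that when FILE_BUDGET sits at its floor of 10, the parallel range is empty and 11 files already select the extended strategy.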
File partitioning rules (Extended/Scaled modes):
- Service-aware partitioning (preferred): If Recon detected multiple service boundaries (monorepo), partition by service.
- Risk-tier partitioning (fallback): process CRITICAL, then HIGH, then MEDIUM, then LOW.
- Keep chunk size small (recommended 20-40 files) to avoid context compaction issues.
- Persist chunk progress in `.bug-hunter/state.json` so restarts do not re-scan done chunks.
- Test files (CONTEXT-ONLY) are included only when needed for intent.
If the triage output shows `needsLoop: true` and `LOOP_MODE=false` (user passed `--no-loop`), warn the user: "This codebase has [N] source files (FILE_BUDGET: [B]). Single-pass mode will only cover a subset. Loop mode is recommended for thorough coverage (remove `--no-loop` to enable). Large codebases use domain-scoped auditing; see `modes/large-codebase.md`."
Execution Steps
Step 0: Preflight checks
Before doing anything else, verify the environment:
1. Resolve skill directory: Determine `SKILL_DIR` dynamically.
   - Preferred: derive it from the absolute path of the current `SKILL.md` (`dirname` of this file).
   - Fallback probe order: `$HOME/.agents/skills/bug-hunter`, `$HOME/.claude/skills/bug-hunter`, `$HOME/.codex/skills/bug-hunter`.
   - Use this path for ALL Read tool calls and shell commands.
2. Verify skill files exist: Run `ls "$SKILL_DIR/prompts/hunter.md"` via Bash. If this fails, stop and tell the user: "Bug Hunter skill files not found. Reinstall the skill and retry."
3. Node.js available: Run `node --version` via Bash. If it fails, stop and tell the user: "Node.js is required for doc verification. Please install Node.js to continue."
3b. Create output directory:
mkdir -p .bug-hunter/payloads .bug-hunter/domains
This directory stores all pipeline artifacts. Add `.bug-hunter/` to your project's `.gitignore`.
4. Doc lookup availability (optional, non-blocking): Run a quick smoke test:
node "$SKILL_DIR/scripts/doc-lookup.cjs" search "express" "middleware"
If it returns results, set `DOC_LOOKUP_AVAILABLE=true`. If it fails, try the fallback:
node "$SKILL_DIR/scripts/context7-api.cjs" search "express" "middleware"
If both fail, warn the user and set `DOC_LOOKUP_AVAILABLE=false`. Missing `CONTEXT7_API_KEY` must NOT block execution; anonymous lookups may still work.
5. Verify helper scripts exist:
ls "$SKILL_DIR/scripts/run-bug-hunter.cjs" "$SKILL_DIR/scripts/bug-hunter-state.cjs" "$SKILL_DIR/scripts/delta-mode.cjs" "$SKILL_DIR/scripts/payload-guard.cjs" "$SKILL_DIR/scripts/fix-lock.cjs" "$SKILL_DIR/scripts/triage.cjs" "$SKILL_DIR/scripts/doc-lookup.cjs"
If any are missing, stop and tell the user to update/reinstall the skill.
Note: `code-index.cjs` is optional. It enables cross-domain dependency analysis for boundary audits in large-codebase mode, but the pipeline works fully without it.
Note: `context7-api.cjs` is kept as a fallback; `doc-lookup.cjs` is the primary doc verification script.
Note: `worktree-harvest.cjs` is optional. It enables worktree-isolated Fixer dispatch for `subagent`/`teams` backends. Without it, Fixers edit directly on the fix branch (still safe via single-writer lock + auto-revert).
5b. Check Context Hub CLI (recommended, non-blocking):
chub --help 2>/dev/null && chub update 2>/dev/null
If `chub` is available, set `CHUB_AVAILABLE=true`. Report:
✓ Context Hub available — using curated docs for verification.
If `chub` is NOT installed, set `CHUB_AVAILABLE=false`. Warn the user visibly:
⚠️ Context Hub (chub) is not installed. Doc verification will fall back to Context7 API,
which has broader coverage but less curated results.
For better doc verification accuracy, install Context Hub:
npm install -g @aisuite/chub
More info: https://github.com/andrewyng/context-hub
Do NOT block the pipeline — Context7 fallback works, just with less curated results.
6. Select orchestration backend (cross-CLI portability):
Detect which dispatch tools are available in your runtime. Use the FIRST that works:

Option A: `subagent` tool (Pi agent, preferred for parallel):
- Test: call `subagent({ action: "list" })`. If it returns without error, this backend works.
- Set `AGENT_BACKEND = "subagent"`
- Dispatch pattern for each phase:
subagent({
  agent: "-agent",
  task: "",
  output: ".bug-hunter/-output.md"
})
- Read the output file after the subagent completes.

Option B: `teams` tool (Pi agent teams):
- Test: does the `teams` tool exist in your available tools?
- Set `AGENT_BACKEND = "teams"`
- Dispatch pattern:
teams({
  tasks: [{ text: "" }],
  maxTeammates: 1
})

Option C: `interactive_shell` (Claude Code, Codex, other CLI agents):
- Set `AGENT_BACKEND = "interactive_shell"`
- Dispatch pattern:
interactive_shell({
  command: 'pi ""',
  mode: "dispatch"
})

Option D: `local-sequential` (default, always works):
- Set `AGENT_BACKEND = "local-sequential"`
- Read `SKILL_DIR/modes/local-sequential.md` for full instructions.
- You run all phases (Recon, Hunter, Skeptic, Referee) yourself, sequentially, within your own context window.
- Write phase outputs to `.bug-hunter/` files between phases.

IMPORTANT: `local-sequential` is NOT a degraded mode. It is the expected default for most environments and the skill works fully in this mode. Subagent dispatch is an optimization for large codebases, not a requirement.

Rules:
- Use exactly ONE backend for the whole run.
- If a remote backend launch fails, fall back to the next option.
- If all remote backends fail, use `local-sequential` and continue.
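The backend rules amount to an ordered fallback. A sketch under stated assumptions: `selectBackend` and the `probes` map are hypothetical, since in practice each probe is a tool call made by the agent rather than a JavaScript function.

```javascript
// Preference order from the options above; the last entry always works.
const BACKEND_ORDER = ["subagent", "teams", "interactive_shell", "local-sequential"];

// probes: hypothetical map of backend name -> function returning true when usable.
// A missing probe, a false result, or a thrown error all mean "try the next one".
function selectBackend(probes) {
  for (const name of BACKEND_ORDER) {
    if (name === "local-sequential") return name;
    const probe = probes[name];
    if (typeof probe !== "function") continue;
    try {
      if (probe()) return name;
    } catch (err) {
      // Launch or probe failure: fall through to the next backend.
    }
  }
  return "local-sequential";
}

const chosen = selectBackend({
  subagent: () => { throw new Error("tool not available"); },
  teams: () => true,
});
console.log(chosen);
```

Whatever value is returned should then be used for the whole run, per the single-backend rule.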
Step 1: Parse arguments, resolve target, and run triage
Follow the rules in the Target section above. If in branch diff or staged mode, run the appropriate git command now, collect the file list, and apply the filter.
Report to the user:
Mode (full project / directory / file / branch diff / staged)
Number of source files to scan (after filtering)
Number of files filtered out
Then run triage (zero-token strategy decision):
Run the triage script AFTER resolving the target. This is a pure Node.js filesystem scan — no tokens consumed, runs in <2 seconds even on 2,000+ file repos.
node "$SKILL_DIR/scripts/triage.cjs" scan "" --output .bug-hunter/triage.json
Then read `.bug-hunter/triage.json`. It contains:
- `strategy`: which mode to use ("single-file", "small", "parallel", "extended", "scaled", "large-codebase")
- `modeFile`: which mode file to read
- `fileBudget`: computed from actual file sizes (sampled), not a guess
- `totalFiles` / `scannableFiles`: exact count
- `domains`: directory-level risk classification (CRITICAL/HIGH/MEDIUM/LOW/CONTEXT-ONLY)
- `riskMap`: file-level classification (only present when ≤200 files)
- `domainFileLists`: per-domain file lists (only present for large-codebase strategy)
- `scanOrder`: priority-ordered list for Hunters
- `tokenEstimate`: cost estimates for each pipeline phase
- `needsLoop`: whether loop mode is needed for full coverage (loop is on by default; this indicates `--no-loop` would cause incomplete coverage)
Set these variables from the triage output:
STRATEGY = triage.strategy
FILE_BUDGET = triage.fileBudget
TOTAL_FILES = triage.totalFiles
SCANNABLE_FILES = triage.scannableFiles
NEEDS_LOOP = triage.needsLoop
Report to the user:
Triage: [TOTAL_FILES] source files | FILE_BUDGET: [FILE_BUDGET] | Strategy: [STRATEGY]
Domains: [N] CRITICAL, [N] HIGH, [N] MEDIUM, [N] LOW
Token estimate: ~[N] tokens for full pipeline
If triage says `needsLoop: true` and `LOOP_MODE=false` (user passed `--no-loop`), warn:
⚠️ This codebase has [N] source files (FILE_BUDGET: [B]).
Single-pass mode will only cover a subset. Remove --no-loop to enable iterative coverage.
Proceeding with partial scan — highest-priority queued files only.
Triage replaces Recon's FILE_BUDGET computation.
Recon still runs for tech stack identification and pattern-based analysis, but it no longer needs to count files or compute the context budget — triage already did that, for free.
Step 1b: Generate threat model (if --threat-model)
If `THREAT_MODEL_MODE=true`:
- Check if `.bug-hunter/threat-model.md` already exists.
  - If it exists and was modified within the last 90 days: use it as-is. Set `THREAT_MODEL_AVAILABLE=true`.
  - If it exists but is >90 days old: warn user ("Threat model is N days old, regenerating"), regenerate.
  - If it doesn't exist: generate it.
- To generate:
  - Read `$SKILL_DIR/prompts/threat-model.md`.
  - Dispatch the threat model generation agent (or execute locally if local-sequential).
  - Input: triage.json (if available) for file structure, or Glob-based discovery.
  - Wait for `.bug-hunter/threat-model.md` to be written.
- Set `THREAT_MODEL_AVAILABLE=true`.

If `THREAT_MODEL_MODE=false` but `.bug-hunter/threat-model.md` exists:
- Load it anyway (free context). Set `THREAT_MODEL_AVAILABLE=true`.
- Report: "Existing threat model found, loading for enhanced security analysis."
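The freshness check for an explicit `--threat-model` run can be sketched with the file's modification time. Function names here are hypothetical; the 90-day threshold and the file path come from the text, and treating exactly 90 days as still fresh is an assumption.

```javascript
const fs = require("node:fs");

const MAX_AGE_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Pure decision step. mtimeMs is null when the file does not exist.
function classifyThreatModel(mtimeMs, nowMs) {
  if (mtimeMs === null) return { action: "generate" };
  const ageDays = (nowMs - mtimeMs) / MS_PER_DAY;
  return ageDays > MAX_AGE_DAYS
    ? { action: "regenerate", ageDays: Math.floor(ageDays) }
    : { action: "reuse", ageDays: Math.floor(ageDays) };
}

// Filesystem wrapper. statSync returns undefined for a missing file
// when throwIfNoEntry is false.
function threatModelAction(filePath = ".bug-hunter/threat-model.md", nowMs = Date.now()) {
  const stat = fs.statSync(filePath, { throwIfNoEntry: false });
  return classifyThreatModel(stat ? stat.mtimeMs : null, nowMs);
}

console.log(threatModelAction());
```

The sketch covers only the `THREAT_MODEL_MODE=true` branch; when the flag is absent, an existing file is loaded regardless of age.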
Step 1c: Dependency scan (if --deps)
If `DEP_SCAN=true`:
node "$SKILL_DIR/scripts/dep-scan.cjs" --target "" --output .bug-hunter/dep-findings.json
Report to user:
Dependencies: [N] HIGH/CRITICAL CVEs found | [R] reachable, [P] potentially reachable, [U] not reachable
If `.bug-hunter/dep-findings.json` exists with REACHABLE findings, include them in Hunter context as "Known Vulnerable Dependencies". Hunter should verify if vulnerable APIs are called in scanned source files.
Step 2: Read prompt files on demand (context efficiency)
MANDATORY: You MUST read prompt files using the Read tool before passing them to subagents or executing them yourself. Do NOT skip this or act from memory. Use the absolute SKILL_DIR path resolved in Step 0.

Load only what you need for each phase; do NOT read all files upfront:

| Phase | Read These Files |
|---|---|
| Threat Model (Step 1b) | `prompts/threat-model.md` (only if THREAT_MODEL_MODE=true) |
| Recon (Step 4) | `prompts/recon.md` (skip for single-file mode) |
| Hunters (Step 5) | `prompts/hunter.md` + `prompts/doc-lookup.md` + `prompts/examples/hunter-examples.md` |
| Skeptics (Step 6) | `prompts/skeptic.md` + `prompts/doc-lookup.md` + `prompts/examples/skeptic-examples.md` |
| Referee (Step 7) | `prompts/referee.md` |
| Fixers (Phase 2) | `prompts/fixer.md` + `prompts/doc-lookup.md` (only if FIX_MODE=true) |

Concrete examples for each backend:

Example A: local-sequential (most common)

Phase B — launching Hunter yourself

1. Read the prompt file:

read({ path: "$SKILL_DIR/prompts/hunter.md" })

2. You now have the Hunter's full instructions. Execute them yourself:

- Read each file in risk-map order using the Read tool

- Apply the security checklist sweep

- Write each finding in BUG-N format

3. Write your canonical findings artifact to disk:

write({ path: ".bug-hunter/findings.json", content: "" })

Example B: subagent dispatch

Phase B — launching Hunter via subagent

1. Read the prompt:

read({ path: "$SKILL_DIR/prompts/hunter.md" })

2. Read the wrapper template:

read({ path: "$SKILL_DIR/templates/subagent-wrapper.md" })

3. Fill the template with:

- {ROLE_NAME} = "hunter"

- {ROLE_DESCRIPTION} = "Bug Hunter — find behavioral bugs in source code"

- {PROMPT_CONTENT} =

- {TARGET_DESCRIPTION} = "FindCoffee monorepo backend services"

- {FILE_LIST} =

- {RISK_MAP} =

- {TECH_STACK} =

- {PHASE_SPECIFIC_CONTEXT} =

- {OUTPUT_FILE_PATH} = ".bug-hunter/findings.json"

- {SKILL_DIR} =

4. Dispatch:

subagent({ agent: "hunter-agent", task: "", output: ".bug-hunter/findings.json" })

5. Read the output:

read({ path: ".bug-hunter/findings.json" })
When launching subagents, always pass `SKILL_DIR` explicitly in the task context so prompt commands like `node "$SKILL_DIR/scripts/doc-lookup.cjs"` resolve correctly. The `context7-api.cjs` script is kept as a fallback if `doc-lookup.cjs` fails.
Before every subagent launch, validate payload shape with:
node "$SKILL_DIR/scripts/payload-guard.cjs" validate "" ""
If validation fails, do NOT launch the subagent. Fix the payload first.
Any mode step that says "launch subagent" means "dispatch an agent task using `AGENT_BACKEND`". For `local-sequential`, "launch" means "execute that phase's instructions yourself."
After reading each prompt, extract the key instructions and pass the content to subagents via their system prompts. You do not need to keep the full text in working memory.
Context pruning for subagents:
When passing bug lists to Skeptics, Fixers, or the Referee, only include the bugs assigned to that agent — not the full merged list. For each bug, include: BUG-ID, severity, file, lines, claim, evidence, runtime trigger, cross-references. Omit: the Hunter's internal reasoning, scan coverage stats, and any "FILES SCANNED/SKIPPED" metadata. This keeps subagent prompts lean.
Step 3: Determine execution mode
Use the triage output from Step 1: the strategy and FILE_BUDGET are already computed. Do NOT wait for Recon to determine the mode.
Read the corresponding mode file using `STRATEGY` from the triage JSON:
- `single-file`: `SKILL_DIR/modes/single-file.md`
- `small`: `SKILL_DIR/modes/small.md`
- `parallel`: `SKILL_DIR/modes/parallel.md`
- `extended`: `SKILL_DIR/modes/extended.md`
- `scaled`: `SKILL_DIR/modes/scaled.md`
- `large-codebase`: force `LOOP_MODE=true` and read `SKILL_DIR/modes/large-codebase.md`, then `SKILL_DIR/modes/loop.md`

Backend override for local-sequential: If `AGENT_BACKEND = "local-sequential"`, read `SKILL_DIR/modes/local-sequential.md` instead of the size-based mode file. The local-sequential mode handles all sizes internally with its own chunking logic.

If LOOP_MODE=true, also read:
- `SKILL_DIR/modes/fix-loop.md` when FIX_MODE=true
- `SKILL_DIR/modes/loop.md` otherwise
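One way to read the mode-file rules above, sketched as a lookup. `resolveModeFiles` is hypothetical, paths are relative to `SKILL_DIR`, and the text leaves some combinations open (for example large-codebase together with FIX_MODE=true); this sketch simply applies each rule in turn and removes duplicates.

```javascript
// Returns the effective loop setting and the mode files to read, in order.
function resolveModeFiles({ strategy, backend, loopMode, fixMode }) {
  const files = [];
  const add = (file) => {
    if (!files.includes(file)) files.push(file);
  };

  // Backend override: local-sequential replaces the size-based mode file.
  add(backend === "local-sequential" ? "modes/local-sequential.md" : `modes/${strategy}.md`);

  // The large-codebase strategy forces loop mode and reads loop.md.
  let loop = loopMode;
  if (strategy === "large-codebase") {
    loop = true;
    add("modes/loop.md");
  }

  // Loop mode adds fix-loop.md when fixing, loop.md otherwise.
  if (loop) add(fixMode ? "modes/fix-loop.md" : "modes/loop.md");

  return { loopMode: loop, files };
}

console.log(resolveModeFiles({ strategy: "small", backend: "subagent", loopMode: true, fixMode: true }));
```

The returned list only says which files to read; the `ralph_start` call itself is defined inside the loop mode files.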
CRITICAL (ralph-loop integration): When `LOOP_MODE=true`, you MUST call the `ralph_start` tool before running the first pipeline iteration. The loop mode files (`loop.md` / `fix-loop.md`) contain the exact `ralph_start` call to make, including the `taskContent` and `maxIterations` parameters. Without calling `ralph_start`, the loop will NOT iterate: it will run once and stop. After each iteration, call `ralph_done` to continue, or output `COMPLETE` when done.
Report the chosen mode to the user.
Then follow the steps in the loaded mode file.
Each mode file contains the specific steps for running Recon, Hunters, Skeptics, and Referee for that mode. Each mode also references `modes/_dispatch.md` for backend-specific dispatch patterns. Execute them in order.
Branch-diff and staged optimization: For `-b` and `--staged` modes, if the file count ≤ FILE_BUDGET, always use `small` or `parallel` mode regardless of total codebase size. The triage script already handles this since it only scans the provided target files.
For `extended` and `scaled` modes, initialize state before chunk execution:
node "$SKILL_DIR/scripts/bug-hunter-state.cjs" init ".bug-hunter/state.json" "" "" 30
Then apply hash-based skip filtering before each chunk:
node "$SKILL_DIR/scripts/bug-hunter-state.cjs" hash-filter ".bug-hunter/state.json" ""
For full autonomous chunk orchestration with timeouts, retries, and journaling, extended/scaled modes can use:
node "$SKILL_DIR/scripts/run-bug-hunter.cjs" run --skill-dir "$SKILL_DIR" --files-json "" --mode ""
See `run-bug-hunter.cjs --help` for all options (delta-mode, canary-size, expand-on-low-confidence, etc.).
Step 7: Present the Final Report
After the mode-specific steps complete, display the final report:
1. Scan metadata
Mode (single-file / small / parallel-hybrid / extended / scaled / loop)
Files scanned: N source files (N filtered out)
Architecture: [summary from Recon]
Tech stack: [framework, auth, DB from Recon]
2. Pipeline summary
Triage: [N] source files | FILE_BUDGET: [B] | Strategy: [STRATEGY]
Recon: mapped N files -> CRITICAL: X | HIGH: Y | MEDIUM: Z | Tests: T
Hunters: [deep scan findings: W | optional triage findings: T | merged: U unique]
Gap-fill: [N files re-scanned, M additional findings] (or "not needed")
Skeptics: [challenged X | disproved: D, accepted: A]
Referee: confirmed N real bugs -> Critical: X | Medium: Y | Low: Z
3. Confirmed bugs table
(sorted by severity — from Referee output)
4. Low-confidence items
Flagged for manual review.
Include an Auto-fix eligibility field per bug:
- `ELIGIBLE`: Referee confidence >= 75%
- `MANUAL_REVIEW`: confidence < 75% or missing confidence

If low-confidence items exist, expand scan scope from delta mode using trust-boundary overlays before finalizing report.
5. Dismissed findings
In a collapsed section (for transparency).
6. Agent accuracy stats
Deep Hunter accuracy: X/Y confirmed (Z%)
Optional triage value: N triage-only findings promoted to deep scan
Skeptic accuracy: X/Y correct challenges (Z%)
7. Coverage assessment
If ALL queued scannable source files scanned: "Full queued coverage achieved."
If any missed: list them with note about --loop mode.
7b. Coverage enforcement (mandatory)
If the coverage assessment shows ANY queued scannable source files were not scanned, the pipeline is NOT complete:
- If `LOOP_MODE=true` (default): the ralph-loop will automatically continue to the next iteration covering missed files. Call `ralph_done` to proceed to the next iteration. Do NOT output `COMPLETE` until all queued scannable source files show DONE.
- If `LOOP_MODE=false` (`--no-loop` was specified) AND missed files exist:
  - If total files ≤ FILE_BUDGET × 3: Output the report with a WARNING:
    ⚠️ PARTIAL COVERAGE: [N] queued source files were not scanned.
    Run `/bug-hunter [path]` for complete coverage (loop is on by default).
    Unscanned files: [list them]
  - If total files > FILE_BUDGET × 3: The report MUST include:
    🚨 LARGE CODEBASE: [N] source files (FILE_BUDGET: [B]).
    Single-pass audit covered [X]% of queued source files.
    Use `/bug-hunter [path]` for full coverage (loop is on by default).

Do NOT claim "audit complete" or "full coverage achieved" unless ALL queued scannable source files have status DONE. A partial audit is still valuable; report what you found honestly.

Autonomous runs must keep descending through the remaining priority queue after the current prioritized chunk is done:
- Finish current CRITICAL/HIGH work first.
- Immediately continue with remaining MEDIUM files.
- Then continue with remaining LOW files.
- Only stop when the queue is exhausted, the user interrupts, or a hard blocker prevents safe progress.

If zero bugs were confirmed, say so clearly; a clean report is a good result.
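The auto-fix eligibility rule from the low-confidence section can be sketched as a classifier. The `confidence` field name and its 0-100 scale are assumptions, since the text states only the 75% threshold and the two labels.

```javascript
// Assumption: each confirmed bug carries a numeric `confidence` percentage (0-100).
function autoFixEligibility(bug) {
  const c = bug.confidence;
  const hasConfidence = typeof c === "number" && Number.isFinite(c);
  return hasConfidence && c >= 75 ? "ELIGIBLE" : "MANUAL_REVIEW";
}

// Split confirmed bugs into the auto-fix set and the report-only set.
function partitionByEligibility(bugs) {
  const eligible = [];
  const manualReview = [];
  for (const bug of bugs) {
    (autoFixEligibility(bug) === "ELIGIBLE" ? eligible : manualReview).push(bug.id);
  }
  return { eligible, manualReview };
}

console.log(
  partitionByEligibility([
    { id: "BUG-1", confidence: 92 },
    { id: "BUG-2", confidence: 60 },
    { id: "BUG-3" },
  ])
);
```

Only the `eligible` ids would be handed to Phase 2; the rest stay in the report for manual review.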
Routing after report:
- If confirmed bugs > 0 AND `FIX_MODE=true`:
  - Auto-fix only `ELIGIBLE` bugs.
  - Apply canary-first rollout: fix top critical eligible subset first, verify, then continue remaining eligible fixes.
  - Keep `MANUAL_REVIEW` bugs in report only (do not auto-edit).
  - Run final global consistency pass over merged findings before applying fixes.
  - Read `SKILL_DIR/modes/fix-pipeline.md` and execute Phase 2 on eligible subset.
- If confirmed bugs > 0 AND `FIX_MODE=false`: stop after report (scan-only mode).
- If zero bugs confirmed: stop here. The report is the final output.

8. JSON output (always generated)
After the markdown report, write a machine-readable findings file to `.bug-hunter/findings.json`:

    {
      "version": "3.0.0",
      "scan_id": "scan-YYYY-MM-DD-HHmmss",
      "scan_date": "",
      "mode": "",
      "target": "",
      "files_scanned": 0,
      "threat_model_loaded": false,
      "confirmed": [
        {
          "id": "BUG-1",
          "severity": "CRITICAL",
          "category": "security",
          "stride": "Tampering",
          "cwe": "CWE-89",
          "file": "src/api/users.ts",
          "lines": "45-49",
          "claim": "SQL injection via unsanitized query parameter",
          "reachability": "EXTERNAL",
          "exploitability": "EASY",
          "cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N",
          "cvss_score": 9.1,
          "poc": {
            "payload": "...",
            "request": "...",
            "expected": "...",
            "actual": "..."
          }
        }
      ],
      "dismissed": [
        {
          "id": "BUG-3",
          "severity": "Medium",
          "category": "logic",
          "file": "...",
          "claim": "...",
          "reason": "..."
        }
      ],
      "dependencies": [],
      "summary": {
        "total_reported": 0,
        "confirmed": 0,
        "dismissed": 0,
        "by_severity": { "CRITICAL": 0, "HIGH": 0, "MEDIUM": 0, "LOW": 0 },
        "by_stride": {
          "Tampering": 0,
          "InfoDisclosure": 0,
          "ElevationOfPrivilege": 0,
          "Spoofing": 0,
          "DoS": 0,
          "Repudiation": 0,
          "N/A": 0
        },
        "by_category": { "security": 0, "logic": 0, "error-handling": 0 }
      }
    }

Rules for JSON output:
- Non-security findings: `stride: "N/A"`, `cwe: "N/A"`, omit reachability/CVSS/PoC fields.
- Security findings without CRITICAL/HIGH severity: omit CVSS and PoC fields.
- `dependencies` array: populated only if `--deps` was used and `.bug-hunter/dep-findings.json` exists.

This JSON enables CI/CD gating, dashboard ingestion, and downstream patch generation.
Also write the final markdown report to `.bug-hunter/report.md` as the canonical human-readable output. Generate it from the JSON artifacts with:
node "$SKILL_DIR/scripts/render-report.cjs" report ".bug-hunter/findings.json" ".bug-hunter/referee.json" > ".bug-hunter/report.md"

Self-Test Mode
To validate the pipeline works end-to-end, run `/bug-hunter SKILL_DIR/test-fixture/` on the included test fixture. This directory contains a small Express app with 6 intentionally planted bugs (2 Critical, 3 Medium, 1 Low).
Expected results:
- Recon should classify 3 files as CRITICAL, 1 as HIGH
- Hunters should find all 6 bugs (possibly along with some false positives)
- Skeptic should challenge at least 1 false positive
- Referee should confirm all 6 planted bugs

If the pipeline finds fewer than 5 of the 6 planted bugs, the prompts need tuning. If it reports more than 3 false positives that survive to the Referee, the Skeptic prompt needs tightening.
The test fixture source files ship with the skill.
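The self-test pass criteria can be written down as a small check. This is a sketch with hypothetical names (`evaluateSelfTest`, `plantedFound`, `falsePositivesAtReferee`); the thresholds of 5 of 6 planted bugs and 3 surviving false positives are the ones stated for the fixture.

```javascript
// plantedFound: how many of the 6 planted bugs the Referee confirmed.
// falsePositivesAtReferee: false positives that survived to the Referee.
function evaluateSelfTest({ plantedFound, falsePositivesAtReferee }) {
  const issues = [];
  if (plantedFound < 5) issues.push("prompts need tuning");
  if (falsePositivesAtReferee > 3) issues.push("Skeptic prompt needs tightening");
  return { pass: issues.length === 0, issues };
}

console.log(evaluateSelfTest({ plantedFound: 6, falsePositivesAtReferee: 1 }));
console.log(evaluateSelfTest({ plantedFound: 4, falsePositivesAtReferee: 5 }));
```

Counting which confirmed findings correspond to planted bugs still has to be done by hand or against a known list, which this sketch does not cover.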
If using `--fix` mode on the fixture, initialize its git repo first:
bash SKILL_DIR/scripts/init-test-fixture.sh

Error handling

| Step | Failure | Fallback |
|---|---|---|
| Triage script | error | Skip triage, Recon does full classification with FILE_BUDGET=40 default |
| Recon | timeout/error | Skip Recon, Hunters use triage scanOrder (or Glob-based discovery if no triage) |
| Optional scout pass | timeout/error | Disable scout, continue with deep Hunter |
| Deep Hunter | timeout/error | Retry once on narrowed chunk, otherwise report partial coverage |
| Orchestration backend | launch failure | Fall back to next backend (subagent → teams → interactive_shell → local-sequential) |
| Gap-fill Hunter | timeout/error | Note missed files, continue |
| Payload guard | validation fails | Do not launch subagent; fix payload and retry |
| Chunk orchestrator | timeout/error | Retry with exponential backoff, then mark chunk failed |
| Skeptic | timeout/error | Use single Skeptic or accept all findings as-is |
| Referee | timeout/error | Use Skeptic's accepted list as final result |
| Git safety (Step 8a) | not a git repo | Warn user, skip branching |
| Git safety (Step 8a) | stash/branch fails | Warn, continue without safety net |
| Fix lock | lock held | Stop Phase 2, report concurrent fixer run |
| Test baseline (Step 8c) | timeout/not found | Set BASELINE=null, skip test verification |
| Fixer | timeout/error | Mark unfixed bugs as SKIPPED |
| Post-fix tests | new failures | Auto-revert failed fix commit, mark FIX_REVERTED |
| Post-fix re-scan | timeout/error | Skip re-scan, note "fixer output not re-verified" |
| Worktree prepare | git worktree add fails | Fall back to WORKTREE_MODE=false (direct edit mode) for this run |
| Worktree harvest | no commits found, dirty | Stash uncommitted work, mark bugs as FIX_FAILED (reason: fixer-did-not-commit) |
| Worktree harvest | branch switched | Mark all bugs in batch as FIX_FAILED (reason: branch-switched) |
| Worktree cleanup | git worktree remove fails | Force-remove directory, run git worktree prune |
| Stale worktrees | from previous crash | cleanup-all at Step 8a-wt removes them before starting |
| Fix lock release | release fails | Warn user to clear .bug-hunter/fix.lock manually |