Pentest Validation

When validating security findings:

1. REQUIRE explicit authorization for target URL
2. SCAN with qe-security-scanner (SAST + dependency + secrets)
3. ANALYZE with qe-security-reviewer + qe-security-auditor (parallel)
4. VALIDATE with qe-pentest-validator (graduated exploitation, parallel per vuln type)
5. REPORT only confirmed findings with PoC evidence ("No Exploit, No Report")
6. UPDATE exploit playbook with new patterns

Quality Gates:

- Authorization confirmed before ANY exploitation
- Target URL is staging/dev (NOT production)
- Budget cap enforced ($15 default)
- Time cap enforced (30 min default)
- All exploitation attempts logged

Quick Reference Card

The 4-Phase Pipeline

| Phase | Agent(s) | Purpose | Parallelism |
|-------|----------|---------|-------------|
| 1. Recon | qe-security-scanner | SAST, DAST, dependency scan, secrets | Internal parallel |
| 2. Analysis | qe-security-reviewer + qe-security-auditor | Code review + compliance check | Both in parallel |
| 3. Validation | qe-pentest-validator | Graduated exploit validation | Per-vuln-type parallel |
| 4. Report | qe-quality-gate | "No Exploit, No Report" filter | Sequential |

Graduated Exploitation Tiers

| Tier | Handler | Cost | Latency | Use When |
|------|---------|------|---------|----------|
| 1 | Agent Booster (WASM) | $0 | <1ms | Code pattern is conclusive (`eval`, `innerHTML`, hardcoded creds) |
| 2 | Haiku | $0.0002 | ~500ms | Need payload test against live target |
| 3 | Sonnet/Opus | $0.003-$0.015 | 2-5s | Full exploit chain with data proof |

When to Use This Skill

| Scenario | Tier | Estimated Cost |
|----------|------|----------------|
| PR security review (source only) | 1 | $0 |
| Pre-release validation (staging) | 1-2 | $1-5 |
| Full pentest validation | 1-3 | $5-15 |
| Compliance audit evidence | 1-3 | $5-15 |

Configuration

    pentest:
      target_url: https://staging.app.com   # REQUIRED for Tier 2-3
      source_repo: ./src                    # REQUIRED for Tier 1+
      exploitation_tier: 2                  # 1=pattern-only, 2=payload-test, 3=full-exploit
      vuln_types:                           # Which pipelines to run
        - injection                         # SQL, NoSQL, command injection
        - xss                               # Reflected, stored, DOM XSS
        - auth                              # Auth bypass, session, JWT
        - ssrf                              # URL scheme abuse, metadata
      max_cost_usd: 15                      # Budget cap per run
      timeout_minutes: 30                   # Time cap per run
      require_authorization: true           # MUST confirm target ownership
      no_production: true                   # Block production URLs
      production_patterns:                  # URL patterns to block
        - ".prod."
        - "api.*"
        - "www.*"
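The configuration does not say how `production_patterns` are matched, so the following is only a sketch under stated assumptions: patterns containing `*` are treated as globs over the whole hostname, and patterns without `*` as plain substrings. The helper names `patternToRegExp` and `isProductionUrl` are hypothetical.

```javascript
// Assumption: "*" patterns are globs matched against the full hostname;
// patterns without "*" match anywhere inside the hostname.
function patternToRegExp(pattern) {
  // Escape regex metacharacters except "*", which is handled below.
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return pattern.includes("*")
    ? new RegExp("^" + escaped.replace(/\*/g, ".*") + "$")
    : new RegExp(escaped);
}

// Returns true when the target's hostname matches any blocked pattern.
function isProductionUrl(targetUrl, patterns) {
  const { hostname } = new URL(targetUrl);
  return patterns.some((p) => patternToRegExp(p).test(hostname));
}

const productionPatterns = [".prod.", "api.*", "www.*"];

console.log(isProductionUrl("https://staging.app.com", productionPatterns));
console.log(isProductionUrl("https://api.example.com/v1", productionPatterns));
```

Under these assumptions a staging host such as `api.staging.example.com` would also be blocked, which errs on the side of refusing to run.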

Safeguards (Mandatory)

Authorization Gate

Every pentest validation run MUST:

1. Display target URL and exploitation tier to user

2. Require explicit confirmation: "I own/authorized testing of this target"

3. Log authorization with timestamp

4. Block if target URL matches production patterns
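The four requirements above could be enforced by a single gate function. This is a minimal sketch, not the skill's actual implementation: `authorizeRun`, its parameters, and the shape of the returned log entry are all hypothetical, and the production check is passed in as a predicate.

```javascript
const REQUIRED_CONFIRMATION = "I own/authorized testing of this target";

// Hypothetical gate: returns an authorization log entry or throws.
function authorizeRun({ targetUrl, tier, confirmation, isProduction, now = new Date() }) {
  // Display target URL and exploitation tier to the user.
  console.log(`Target: ${targetUrl} | Exploitation tier: ${tier}`);

  // Block production targets before anything else happens.
  if (isProduction(targetUrl)) {
    throw new Error(`Blocked: ${targetUrl} matches a production pattern`);
  }

  // Require the explicit confirmation phrase.
  if (confirmation !== REQUIRED_CONFIRMATION) {
    throw new Error("Blocked: explicit authorization not confirmed");
  }

  // Log the authorization with a timestamp.
  return { targetUrl, tier, authorizedAt: now.toISOString() };
}
```

Throwing instead of returning `false` means a caller cannot proceed to exploitation by forgetting to check the result.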

What This Skill Does NOT Do

- Full autonomous reconnaissance (Nmap, Subfinder)

- Zero-day exploit development

- Attack targets without explicit authorization

- Test production systems

- Store actual exfiltrated data (only proof of access)

- Social engineering or phishing simulation

- Port scanning or service discovery

Validation Pipelines

Injection Pipeline

| Attack | Tier 1 (Pattern) | Tier 2 (Payload) | Tier 3 (Full) |
|--------|------------------|------------------|---------------|
| SQL injection | String concat in query | `' OR '1'='1` response diff | UNION SELECT data extraction |
| NoSQL injection | `$where`, `$gt` in query | Operator injection test | Collection enumeration |
| Command injection | `exec()`, `system()` calls | Command delimiter test | Reverse shell proof |
| LDAP injection | String concat in filter | Wildcard injection | Directory enumeration |
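To illustrate what a Tier 1 (pattern-only) check in this pipeline might look like, here is a rough sketch that scans source text line by line. The regexes are illustrative heuristics chosen for this example, not the rules the agent actually uses, and a real scanner would work on a parsed syntax tree rather than raw text.

```javascript
// Illustrative heuristics only; names and patterns are assumptions.
const INJECTION_PATTERNS = [
  // A SQL keyword followed by a string literal that is concatenated with "+".
  { type: "sql-injection", regex: /\b(SELECT|INSERT|UPDATE|DELETE)\b[^;]*["'`]\s*\+/i },
  // MongoDB operators that commonly appear in injectable queries.
  { type: "nosql-injection", regex: /\$where|\$gt/ },
  // Direct shell execution calls.
  { type: "command-injection", regex: /\b(exec|system)\s*\(/ },
];

// Returns one hit per (line, pattern) match with a 1-based line number.
function scanSource(source) {
  const hits = [];
  source.split("\n").forEach((text, i) => {
    for (const { type, regex } of INJECTION_PATTERNS) {
      if (regex.test(text)) hits.push({ type, line: i + 1, text: text.trim() });
    }
  });
  return hits;
}

const sample = [
  'const q = "SELECT * FROM users WHERE id = " + userId;',
  'const safe = db.query("SELECT * FROM users WHERE id = ?", [userId]);',
  'child_process.exec("ls " + dir);',
].join("\n");

console.log(scanSource(sample).map((h) => `${h.line}: ${h.type}`));
```

The parameterized query on the second line is not flagged, which is the distinction Tier 1 is meant to draw before any payload is sent.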

XSS Pipeline

| Attack | Tier 1 (Pattern) | Tier 2 (Payload) | Tier 3 (Full) |
|--------|------------------|------------------|---------------|
| Reflected XSS | No output encoding | Payload reflection | Browser JS execution via Playwright |
| Stored XSS | `innerHTML` assignment | Payload stored + retrieved | Cookie theft PoC |
| DOM XSS | `document.write(location)` | Fragment injection | DOM manipulation proof |

Auth Pipeline

| Attack | Tier 1 (Pattern) | Tier 2 (Payload) | Tier 3 (Full) |
|--------|------------------|------------------|---------------|
| JWT none | No algorithm validation | Modified JWT accepted | Admin access with forged token |
| Session fixation | No session rotation | Pre-set session reused | Cross-user session hijack |
| Credential stuffing | No rate limiting | 100 attempts unblocked | Valid credential discovery |
| IDOR | No authorization check | Access other user data | Full CRUD on foreign resources |

SSRF Pipeline

| Attack | Tier 1 (Pattern) | Tier 2 (Payload) | Tier 3 (Full) |
|--------|------------------|------------------|---------------|
| Internal URL | User-controlled URL fetch | `http://169.254.169.254` | Cloud metadata extraction |
| DNS rebinding | URL validation bypass | Rebind to internal IP | Internal service access |
| Protocol smuggling | URL scheme not restricted | `file:///etc/passwd` | File content in response |
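The SSRF rows above hinge on which URLs a server-side fetch should never be allowed to reach. As a sketch of that classification, assuming a hypothetical helper `classifySsrfTarget` and category labels invented for this example:

```javascript
// Hypothetical classifier for URLs handed to a server-side fetch.
function classifySsrfTarget(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return "invalid";
  }
  // Anything other than http(s), such as file:, counts as scheme abuse.
  if (url.protocol !== "http:" && url.protocol !== "https:") return "scheme-abuse";

  const host = url.hostname;
  // Link-local address used by cloud instance metadata services.
  if (host === "169.254.169.254") return "cloud-metadata";
  // Loopback and RFC 1918 private ranges.
  if (host === "localhost" || /^(127\.|10\.|192\.168\.|172\.(1[6-9]|2\d|3[01])\.)/.test(host)) {
    return "internal";
  }
  return "external";
}

console.log(classifySsrfTarget("file:///etc/passwd"));
console.log(classifySsrfTarget("http://169.254.169.254/latest/meta-data/"));
```

A string check like this cannot catch DNS rebinding, because the hostname only resolves to an internal address at fetch time; that case needs the live Tier 2 test.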

Agent Coordination

Orchestration Pattern

    // Task(description, input, agentName) is assumed to be supplied by the
    // agent runtime; it is not defined in this document.

    // Phase 1: Recon (parallel scans)
    const phase1Results = await Task("Security Scan", {
      target: "./src",
      layers: { sast: true, dast: true, dependencies: true, secrets: true }
    }, "qe-security-scanner");

    // Phase 2: Analysis (parallel review)
    const phase2Results = await Promise.all([
      Task("Code Security Review", {
        findings: phase1Results,
        depth: "comprehensive"
      }, "qe-security-reviewer"),
      Task("Compliance Audit", {
        findings: phase1Results,
        frameworks: ["owasp-top-10"]
      }, "qe-security-auditor")
    ]);

    // Phase 3: Validation (graduated exploitation)
    const phase3Results = await Task("Exploit Validation", {
      findings: [...phase1Results, ...phase2Results],
      target_url: "https://staging.app.com",
      exploitation_tier: 2,
      vuln_types: ["injection", "xss", "auth", "ssrf"],
      max_cost_usd: 15,
      timeout_minutes: 30
    }, "qe-pentest-validator");

    // Phase 4: Report ("No Exploit, No Report" gate)
    await Task("Security Quality Gate", {
      findings: phase3Results.confirmedFindings,
      gate: "no-exploit-no-report",
      require_poc: true
    }, "qe-quality-gate");
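The `max_cost_usd` and `timeout_minutes` values passed to the validator imply some mechanism that stops a run once either cap is reached and that logs every attempt. One way to sketch this, assuming a hypothetical `RunBudget` class that is not part of the skill:

```javascript
// Hypothetical tracker; default caps mirror the documented $15 / 30 min.
class RunBudget {
  constructor({ maxCostUsd = 15, timeoutMinutes = 30, now = () => Date.now() } = {}) {
    this.maxCostUsd = maxCostUsd;
    this.now = now;
    this.deadline = now() + timeoutMinutes * 60_000;
    this.spentUsd = 0;
    this.log = [];
  }

  // Call before each exploitation attempt; throws if a cap would be exceeded.
  charge(label, costUsd) {
    if (this.now() > this.deadline) throw new Error("Time cap exceeded");
    if (this.spentUsd + costUsd > this.maxCostUsd) throw new Error("Budget cap exceeded");
    this.spentUsd += costUsd;
    // Every attempt is logged with its cost and time.
    this.log.push({ label, costUsd, at: this.now() });
  }
}

const budget = new RunBudget();
budget.charge("sql-injection tier 2", 0.0002);
console.log(budget.log.length);
```

Checking the cap before the attempt, rather than after, keeps a single expensive Tier 3 call from overshooting the budget.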

Finding Classification

| Status | Meaning | Action |
|--------|---------|--------|
| confirmed-exploitable | Exploitation succeeded with PoC | Report with evidence |
| likely-exploitable | Partial exploitation, defenses detected | Report with caveats |
| not-exploitable | All exploitation attempts failed | Filter from report |
| inconclusive | WAF/defense blocked, unclear if vulnerable | Report for manual review |
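The classification above maps directly onto a report filter. A minimal sketch, assuming findings are plain objects with a `status` field; the action labels and the `applyGate` name are made up for this example:

```javascript
// Status-to-action mapping taken from the classification table.
const ACTION_BY_STATUS = {
  "confirmed-exploitable": "report-with-evidence",
  "likely-exploitable": "report-with-caveats",
  "not-exploitable": "filter",
  "inconclusive": "manual-review",
};

// Drops not-exploitable findings and annotates the rest with an action.
function applyGate(findings) {
  const report = [];
  for (const finding of findings) {
    const action = ACTION_BY_STATUS[finding.status];
    if (!action) throw new Error(`Unknown status: ${finding.status}`);
    if (action === "filter") continue;
    report.push({ ...finding, action });
  }
  return report;
}

const report = applyGate([
  { id: "F1", status: "confirmed-exploitable" },
  { id: "F2", status: "not-exploitable" },
  { id: "F3", status: "inconclusive" },
]);
console.log(report.map((f) => `${f.id}: ${f.action}`));
```

Rejecting unknown statuses outright avoids silently reporting or silently dropping a finding that the validator labeled in an unexpected way.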

Exploit Playbook Memory

Namespace Structure

    aqe/pentest/
      playbook/
        exploit/{vuln_type}/{tech_stack}/{technique}
        bypass/{defense_type}/{technique}
        payload/{vuln_type}/{variant}
      results/
        validation-{timestamp}
      poc/
        {finding_id}-poc
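Keys in this namespace can be built with small helpers so that every agent writes to the same paths. The sketch below assumes that `playbook/`, `results/`, and `poc/` all sit directly under `aqe/pentest/`, and that path segments are normalized to lowercase slugs; the slugging rule and the `playbookKeys` name are assumptions, not something the skill specifies.

```javascript
const ROOT = "aqe/pentest";

// Assumed normalization: lowercase, non-alphanumerics collapsed to "-".
const slug = (s) =>
  String(s).trim().toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-|-$/g, "");

const playbookKeys = {
  exploit: (vulnType, techStack, technique) =>
    `${ROOT}/playbook/exploit/${slug(vulnType)}/${slug(techStack)}/${slug(technique)}`,
  bypass: (defenseType, technique) =>
    `${ROOT}/playbook/bypass/${slug(defenseType)}/${slug(technique)}`,
  payload: (vulnType, variant) =>
    `${ROOT}/playbook/payload/${slug(vulnType)}/${slug(variant)}`,
  result: (timestamp) => `${ROOT}/results/validation-${timestamp}`,
  poc: (findingId) => `${ROOT}/poc/${findingId}-poc`,
};

console.log(playbookKeys.exploit("injection", "Node Express", "UNION SELECT"));
// prints "aqe/pentest/playbook/exploit/injection/node-express/union-select"
```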

Learning Loop

Before validation: Query playbook for known patterns matching findings

During validation: Try known payloads first (higher success rate)

After validation: Store new successful patterns with confidence scores
Over time: The agent converges on the most effective payloads per tech stack

Cost Optimization

Estimated Cost by Scenario

| Scenario | Tier Mix | Findings | Est. Cost | Est. Time |
|----------|----------|----------|-----------|-----------|
| PR check (source only) | 100% Tier 1 | 5 | $0 | <5s |
| Sprint validation | 70% T1, 30% T2 | 15 | $2-5 | 5-10 min |
| Release validation | 40% T1, 40% T2, 20% T3 | 25 | $8-15 | 15-30 min |
| Full pentest | 20% T1, 30% T2, 50% T3 | 40 | $15-30 | 30-60 min |

Cost vs Shannon Comparison

| Metric | Shannon | AQE Pentest Validation |
|--------|---------|------------------------|
| Cost per run | ~$50 | $5-15 (graduated tiers) |
| Runtime | 60-90 min | 15-30 min (parallel pipelines) |
| False positive rate | Low (exploit-proven) | Low (same principle) |
| Learning | None (static prompts) | ReasoningBank playbook |

Success Metrics

| Metric | Target | Measurement |
|--------|--------|-------------|
| False positive reduction | 60% of findings eliminated | Pre/post validator comparison |
| Exploit confirmation rate | 80% of confirmed findings truly exploitable | Manual PoC verification |
| Cost per run | <$15 USD | Token tracking per pipeline |
| Time per run | <30 minutes | Execution time metrics |
| Playbook growth | 100+ patterns after 6 months | Memory namespace count |
