pentest-validation

Installs: 55
Rank: #13578

Install

npx skills add https://github.com/proffesor-for-testing/agentic-qe --skill pentest-validation

Pentest Validation

When validating security findings:
- REQUIRE explicit authorization for target URL
- SCAN with qe-security-scanner (SAST + dependency + secrets)
- ANALYZE with qe-security-reviewer + qe-security-auditor (parallel)
- VALIDATE with qe-pentest-validator (graduated exploitation, parallel per vuln type)
- REPORT only confirmed findings with PoC evidence ("No Exploit, No Report")
- UPDATE exploit playbook with new patterns

Quality Gates:
- Authorization confirmed before ANY exploitation
- Target URL is staging/dev (NOT production)
- Budget cap enforced ($15 default)
- Time cap enforced (30 min default)
- All exploitation attempts logged

Quick Reference Card

The 4-Phase Pipeline

| Phase | Agent(s) | Purpose | Parallelism |
|---|---|---|---|
| 1. Recon | qe-security-scanner | SAST, DAST, dependency scan, secrets | Internal parallel |
| 2. Analysis | qe-security-reviewer + qe-security-auditor | Code review + compliance check | Both in parallel |
| 3. Validation | qe-pentest-validator | Graduated exploit validation | Per-vuln-type parallel |
| 4. Report | qe-quality-gate | "No Exploit, No Report" filter | Sequential |

Graduated Exploitation Tiers

| Tier | Handler | Cost | Latency | Use When |
|---|---|---|---|---|
| 1 | Agent Booster (WASM) | $0 | <1ms | Code pattern is conclusive (eval, innerHTML, hardcoded creds) |
| 2 | Haiku | $0.0002 | ~500ms | Need payload test against live target |
| 3 | Sonnet/Opus | $0.003-$0.015 | 2-5s | Full exploit chain with data proof |

When to Use This Skill

| Scenario | Tier | Estimated Cost |
|---|---|---|
| PR security review (source only) | 1 | $0 |
| Pre-release validation (staging) | 1-2 | $1-5 |
| Full pentest validation | 1-3 | $5-15 |
| Compliance audit evidence | 1-3 | $5-15 |

Configuration

pentest:
  target_url: https://staging.app.com   # REQUIRED for Tier 2-3
  source_repo: ./src                    # REQUIRED for Tier 1+
  exploitation_tier: 2                  # 1=pattern-only, 2=payload-test, 3=full-exploit
  vuln_types:                           # Which pipelines to run
    - injection                         # SQL, NoSQL, command injection
    - xss                               # Reflected, stored, DOM XSS
    - auth                              # Auth bypass, session, JWT
    - ssrf                              # URL scheme abuse, metadata
  max_cost_usd: 15                      # Budget cap per run
  timeout_minutes: 30                   # Time cap per run
  require_authorization: true           # MUST confirm target ownership
  no_production: true                   # Block production URLs
  production_patterns:                  # URL patterns to block
    - ".prod."
    - "api.*"
    - "www.*"

Safeguards (Mandatory)

Authorization Gate

Every pentest validation run MUST:
1. Display target URL and exploitation tier to user
2. Require explicit confirmation: "I own/authorized testing of this target"
3. Log authorization with timestamp
4. Block if target URL matches production patterns
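
The four gate steps above can be sketched as a single check that runs before any exploitation attempt. This is an illustration under stated assumptions, not the skill's implementation: `authorizationGate`, `matchesProductionPattern`, the return shape, and the audit log array are hypothetical, and the pattern semantics (a `*` glob over the whole hostname, otherwise a substring match) are a guess, since the source does not say how `production_patterns` are evaluated.

```javascript
// Illustrative sketch only: names, return shape, and pattern semantics are
// assumptions, not the skill's actual implementation.
const REQUIRED_CONFIRMATION = "I own/authorized testing of this target";

// Assumed semantics: a pattern containing "*" is a glob over the whole
// hostname; any other pattern is a plain substring match.
function matchesProductionPattern(hostname, pattern) {
  if (!pattern.includes("*")) return hostname.includes(pattern);
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`).test(hostname);
}

function authorizationGate(request, auditLog) {
  const { targetUrl, tier, confirmation, productionPatterns } = request;
  const hostname = new URL(targetUrl).hostname;

  // Display target URL and exploitation tier to the user.
  console.log(`Target: ${targetUrl} (exploitation tier ${tier})`);

  // Block if the target URL matches a production pattern.
  const blockedBy = productionPatterns.find((p) =>
    matchesProductionPattern(hostname, p)
  );
  if (blockedBy !== undefined) {
    return { allowed: false, reason: `matches production pattern ${blockedBy}` };
  }

  // Require the explicit confirmation phrase.
  if (confirmation !== REQUIRED_CONFIRMATION) {
    return { allowed: false, reason: "explicit authorization not confirmed" };
  }

  // Log the authorization with a timestamp.
  auditLog.push({ targetUrl, tier, authorizedAt: new Date().toISOString() });
  return { allowed: true };
}

const auditLog = [];
const decision = authorizationGate(
  {
    targetUrl: "https://staging.app.com",
    tier: 2,
    confirmation: "I own/authorized testing of this target",
    productionPatterns: [".prod.", "api.*", "www.*"],
  },
  auditLog
);
console.log(decision.allowed); // true for this staging host
```

The production check runs before the confirmation check so that a production target is rejected even when the operator has typed the confirmation phrase.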

What This Skill Does NOT Do
- Full autonomous reconnaissance (Nmap, Subfinder)
- Zero-day exploit development
- Attack targets without explicit authorization
- Test production systems
- Store actual exfiltrated data (only proof of access)
- Social engineering or phishing simulation
- Port scanning or service discovery

Validation Pipelines

Injection Pipeline

| Attack | Tier 1 (Pattern) | Tier 2 (Payload) | Tier 3 (Full) |
|---|---|---|---|
| SQL injection | String concat in query | ' OR '1'='1 response diff | UNION SELECT data extraction |
| NoSQL injection | $where, $gt in query | Operator injection test | Collection enumeration |
| Command injection | exec(), system() calls | Command delimiter test | Reverse shell proof |
| LDAP injection | String concat in filter | Wildcard injection | Directory enumeration |
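
As a rough illustration of the Tier 1 (Pattern) column, the sketch below flags source lines that match the table's code patterns. The regular expressions and the `scanForInjectionPatterns` name are hypothetical and deliberately crude; they produce candidates for review, not confirmed findings, and the LDAP row is omitted.

```javascript
// Hypothetical Tier 1 heuristics: regexes approximating the table's
// "Tier 1 (Pattern)" column. They will produce false positives.
const TIER1_INJECTION_PATTERNS = [
  // A SQL keyword followed, before any ";", by a closing quote and "+".
  { type: "sql-injection", regex: /\b(SELECT|INSERT|UPDATE|DELETE)\b[^;]*["'`]\s*\+/i },
  // MongoDB-style operators appearing in the source line.
  { type: "nosql-injection", regex: /\$where|\$gt/ },
  // Calls to exec(...) or system(...).
  { type: "command-injection", regex: /\b(exec|system)\s*\(/ },
];

// Returns one { type, line } entry per pattern hit, with 1-based line numbers.
function scanForInjectionPatterns(source) {
  const findings = [];
  source.split("\n").forEach((text, index) => {
    for (const { type, regex } of TIER1_INJECTION_PATTERNS) {
      if (regex.test(text)) findings.push({ type, line: index + 1 });
    }
  });
  return findings;
}

const sample = [
  'const q = "SELECT * FROM users WHERE id = " + userId;',
  'db.query("SELECT * FROM users WHERE id = ?", [userId]);',
  'exec("ping " + host);',
].join("\n");

// Flags line 1 (string concatenation) and line 3 (exec call); the
// parameterized query on line 2 is not flagged.
console.log(scanForInjectionPatterns(sample));
```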

XSS Pipeline

| Attack | Tier 1 (Pattern) | Tier 2 (Payload) | Tier 3 (Full) |
|---|---|---|---|
| Reflected XSS | No output encoding | Payload reflection | Browser JS execution via Playwright |
| Stored XSS | innerHTML assignment | Payload stored + retrieved | Cookie theft PoC |
| DOM XSS | document.write(location) | Fragment injection | DOM manipulation proof |
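
For the reflected XSS row, a Tier 2 check comes down to whether a probe string returns in the response unencoded. The sketch below shows one way to classify a response body that has already been fetched. `classifyReflection`, `htmlEncode`, and the probe marker are hypothetical, and the encoding table is a common minimal set that a real server may not follow exactly.

```javascript
// Hypothetical helpers. The entity table is a common minimal set; servers
// may encode differently (for example &#x27; instead of &#39;).
function htmlEncode(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Classifies how a probe string appears in an already-fetched response body.
function classifyReflection(probe, responseBody) {
  if (responseBody.includes(probe)) return "reflected-unencoded";
  if (responseBody.includes(htmlEncode(probe))) return "reflected-encoded";
  return "not-reflected";
}

// A harmless marker element is enough to detect missing output encoding.
const probe = '<i data-probe="aqe-1">';

console.log(classifyReflection(probe, `<p>Results for ${probe}</p>`)); // reflected-unencoded
console.log(classifyReflection(probe, `<p>Results for ${htmlEncode(probe)}</p>`)); // reflected-encoded
console.log(classifyReflection(probe, "<p>No results</p>")); // not-reflected
```

Only the unencoded case suggests the finding is worth escalating to Tier 3 browser execution; an encoded reflection indicates output encoding is in place for that context.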

Auth Pipeline

| Attack | Tier 1 (Pattern) | Tier 2 (Payload) | Tier 3 (Full) |
|---|---|---|---|
| JWT none | No algorithm validation | Modified JWT accepted | Admin access with forged token |
| Session fixation | No session rotation | Pre-set session reused | Cross-user session hijack |
| Credential stuffing | No rate limiting | 100 attempts unblocked | Valid credential discovery |
| IDOR | No authorization check | Access other user data | Full CRUD on foreign resources |
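
The "JWT none" row concerns servers that accept tokens whose header declares no signing algorithm. The sketch below shows the kind of header check whose absence the Tier 1 pattern looks for. The function names and sample tokens are hypothetical, the tokens are built locally with no real secret, and the `base64url` encoding assumes a reasonably recent Node.js.

```javascript
// Hypothetical helpers: decode the JWT header segment and report whether it
// declares the "none" algorithm. No signature verification is performed here.
function decodeJwtHeader(token) {
  const headerSegment = token.split(".")[0];
  return JSON.parse(Buffer.from(headerSegment, "base64url").toString("utf8"));
}

function declaresNoneAlgorithm(token) {
  const { alg } = decodeJwtHeader(token);
  // Compare case-insensitively so variants such as "None" are also caught.
  return typeof alg === "string" && alg.toLowerCase() === "none";
}

// Sample tokens built locally for the demonstration (placeholder signature).
const encodeSegment = (value) =>
  Buffer.from(JSON.stringify(value)).toString("base64url");
const payloadSegment = encodeSegment({ sub: "test-user" });
const unsignedToken = `${encodeSegment({ alg: "none", typ: "JWT" })}.${payloadSegment}.`;
const hs256Token = `${encodeSegment({ alg: "HS256", typ: "JWT" })}.${payloadSegment}.placeholder-signature`;

console.log(declaresNoneAlgorithm(unsignedToken)); // true
console.log(declaresNoneAlgorithm(hs256Token)); // false
```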

SSRF Pipeline

| Attack | Tier 1 (Pattern) | Tier 2 (Payload) | Tier 3 (Full) |
|---|---|---|---|
| Internal URL | User-controlled URL fetch | http://169.254.169.254 | Cloud metadata extraction |
| DNS rebinding | URL validation bypass | Rebind to internal IP | Internal service access |
| Protocol smuggling | URL scheme not restricted | file:///etc/passwd | File content in response |
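
The SSRF rows rely on a fetch endpoint failing to restrict schemes and destinations. As a hedged sketch, the classifier below labels a user-supplied URL using the two payloads from the table. `classifySsrfTarget` and its labels are hypothetical, the private ranges are limited to IPv4, and no DNS resolution is done, so the DNS rebinding row is out of scope.

```javascript
// Hypothetical classifier for a user-supplied URL. It inspects the literal
// URL only: no DNS resolution, so rebinding cannot be detected here.
function classifySsrfTarget(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return "invalid-url";
  }

  // Protocol smuggling: anything other than http(s), e.g. file:.
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return "disallowed-scheme";
  }

  const host = url.hostname;
  // Link-local metadata address used by several cloud providers.
  if (host === "169.254.169.254") return "cloud-metadata";

  // Loopback and RFC 1918 private IPv4 ranges.
  const isInternal =
    host === "localhost" ||
    /^127\./.test(host) ||
    /^10\./.test(host) ||
    /^192\.168\./.test(host) ||
    /^172\.(1[6-9]|2\d|3[01])\./.test(host);
  return isInternal ? "internal-address" : "external";
}

console.log(classifySsrfTarget("http://169.254.169.254")); // cloud-metadata
console.log(classifySsrfTarget("file:///etc/passwd")); // disallowed-scheme
console.log(classifySsrfTarget("https://staging.app.com/avatar.png")); // external
```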

Agent Coordination

Orchestration Pattern

// Each Task(...) call is assumed to resolve to an array of findings for its phase.

// Phase 1: Recon (parallel scans)
const phase1Results = await Task("Security Scan", {
  target: "./src",
  layers: { sast: true, dast: true, dependencies: true, secrets: true }
}, "qe-security-scanner");

// Phase 2: Analysis (parallel review); merge both agents' findings into one list
const phase2Results = (await Promise.all([
  Task("Code Security Review", {
    findings: phase1Results,
    depth: "comprehensive"
  }, "qe-security-reviewer"),
  Task("Compliance Audit", {
    findings: phase1Results,
    frameworks: ["owasp-top-10"]
  }, "qe-security-auditor")
])).flat();

// Phase 3: Validation (graduated exploitation)
const phase3Results = await Task("Exploit Validation", {
  findings: [...phase1Results, ...phase2Results],
  target_url: "https://staging.app.com",
  exploitation_tier: 2,
  vuln_types: ["injection", "xss", "auth", "ssrf"],
  max_cost_usd: 15,
  timeout_minutes: 30
}, "qe-pentest-validator");

// Phase 4: Report ("No Exploit, No Report" gate)
await Task("Security Quality Gate", {
  findings: phase3Results.confirmedFindings,
  gate: "no-exploit-no-report",
  require_poc: true
}, "qe-quality-gate");

Finding Classification

| Status | Meaning | Action |
|---|---|---|
| confirmed-exploitable | Exploitation succeeded with PoC | Report with evidence |
| likely-exploitable | Partial exploitation, defenses detected | Report with caveats |
| not-exploitable | All exploitation attempts failed | Filter from report |
| inconclusive | WAF/defense blocked, unclear if vulnerable | Report for manual review |
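
The classification table maps directly onto a report filter. The sketch below is one possible shape for that step. `applyReportGate`, the action labels, and the finding objects are hypothetical; only the four status strings come from the table.

```javascript
// Status strings come from the classification table; the action labels and
// function name are hypothetical.
const ACTION_BY_STATUS = {
  "confirmed-exploitable": "report-with-evidence",
  "likely-exploitable": "report-with-caveats",
  "not-exploitable": "filter",
  "inconclusive": "manual-review",
};

// Attaches an action to each finding and drops those the table filters out.
function applyReportGate(findings) {
  return findings
    .map((finding) => {
      const action = ACTION_BY_STATUS[finding.status];
      if (action === undefined) {
        throw new Error(`unknown finding status: ${finding.status}`);
      }
      return { ...finding, action };
    })
    .filter((finding) => finding.action !== "filter");
}

const reported = applyReportGate([
  { id: "F-1", status: "confirmed-exploitable" },
  { id: "F-2", status: "not-exploitable" },
  { id: "F-3", status: "inconclusive" },
]);

console.log(reported.map((f) => `${f.id}: ${f.action}`));
// F-2 is dropped; F-1 and F-3 remain with their actions attached.
```

Throwing on an unknown status is a design choice in this sketch: it keeps a misspelled or new status from silently falling out of the report.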

Exploit Playbook Memory

Namespace Structure

aqe/pentest/
  playbook/
    exploit/{vuln_type}/{tech_stack}/{technique}
    bypass/{defense_type}/{technique}
    payload/{vuln_type}/{variant}
  results/
    validation-{timestamp}
  poc/
    {finding_id}-poc
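
To make the namespace layout concrete, the sketch below builds keys that follow the templates above. The helper names and example field values (such as `express-postgres`) are hypothetical. How the memory store itself is read or written is not shown, since the source does not describe that API.

```javascript
// Segment order per playbook kind, taken from the namespace templates above.
const PLAYBOOK_SEGMENTS = {
  exploit: ["vuln_type", "tech_stack", "technique"],
  bypass: ["defense_type", "technique"],
  payload: ["vuln_type", "variant"],
};

// Hypothetical helper: builds a playbook key, failing on missing fields.
function playbookKey(kind, fields) {
  const names = PLAYBOOK_SEGMENTS[kind];
  if (!names) throw new Error(`unknown playbook kind: ${kind}`);
  const segments = names.map((name) => {
    if (!fields[name]) throw new Error(`missing field: ${name}`);
    return fields[name];
  });
  return `aqe/pentest/playbook/${kind}/${segments.join("/")}`;
}

const resultKey = (timestamp) => `aqe/pentest/results/validation-${timestamp}`;
const pocKey = (findingId) => `aqe/pentest/poc/${findingId}-poc`;

console.log(
  playbookKey("exploit", {
    vuln_type: "injection",
    tech_stack: "express-postgres",
    technique: "union-select",
  })
); // aqe/pentest/playbook/exploit/injection/express-postgres/union-select
console.log(pocKey("F-1")); // aqe/pentest/poc/F-1-poc
```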

Learning Loop
1. Before validation: Query playbook for known patterns matching findings
2. During validation: Try known payloads first (higher success rate)
3. After validation: Store new successful patterns with confidence scores
4. Over time: Agent converges on most effective payloads per tech stack

Cost Optimization

Estimated Cost by Scenario

| Scenario | Tier Mix | Findings | Est. Cost | Est. Time |
|---|---|---|---|---|
| PR check (source only) | 100% Tier 1 | 5 | $0 | <5s |
| Sprint validation | 70% T1, 30% T2 | 15 | $2-5 | 5-10 min |
| Release validation | 40% T1, 40% T2, 20% T3 | 25 | $8-15 | 15-30 min |
| Full pentest | 20% T1, 30% T2, 50% T3 | 40 | $15-30 | 30-60 min |

Cost vs Shannon Comparison

| Metric | Shannon | AQE Pentest Validation |
|---|---|---|
| Cost per run | ~$50 | $5-15 (graduated tiers) |
| Runtime | 60-90 min | 15-30 min (parallel pipelines) |
| False positive rate | Low (exploit-proven) | Low (same principle) |
| Learning | None (static prompts) | ReasoningBank playbook |

Success Metrics

| Metric | Target | Measurement |
|---|---|---|
| False positive reduction | 60% of findings eliminated | Pre/post validator comparison |
| Exploit confirmation rate | 80% of confirmed findings truly exploitable | Manual PoC verification |
| Cost per run | <$15 USD | Token tracking per pipeline |
| Time per run | <30 minutes | Execution time metrics |
| Playbook growth | 100+ patterns after 6 months | Memory namespace count |
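
The cost and time targets above match the `max_cost_usd` and `timeout_minutes` caps. A minimal sketch of how a run might enforce them follows. `RunBudget` is hypothetical, and the per-attempt costs assume the tier prices apply per exploitation attempt, using the upper bound for Tier 3; the source does not state the billing unit.

```javascript
// Assumption: tier prices apply per exploitation attempt; Tier 3 uses the
// upper bound of its stated range. The source does not state the unit.
const ASSUMED_COST_PER_ATTEMPT_USD = { 1: 0, 2: 0.0002, 3: 0.015 };

// Hypothetical tracker for the per-run budget and time caps.
class RunBudget {
  constructor({ maxCostUsd = 15, timeoutMinutes = 30, now = Date.now } = {}) {
    this.maxCostUsd = maxCostUsd;
    this.now = now; // injectable clock, which makes the time cap testable
    this.deadlineMs = now() + timeoutMinutes * 60 * 1000;
    this.spentUsd = 0;
  }

  // True only if the attempt fits in both the remaining budget and time.
  canAttempt(tier) {
    const withinTime = this.now() < this.deadlineMs;
    const withinCost =
      this.spentUsd + ASSUMED_COST_PER_ATTEMPT_USD[tier] <= this.maxCostUsd;
    return withinTime && withinCost;
  }

  // Records the cost of an attempt, refusing once either cap is reached.
  record(tier) {
    if (!this.canAttempt(tier)) throw new Error("budget or time cap reached");
    this.spentUsd += ASSUMED_COST_PER_ATTEMPT_USD[tier];
  }
}

const budget = new RunBudget({ maxCostUsd: 15, timeoutMinutes: 30 });
budget.record(2);
console.log(budget.spentUsd); // 0.0002
console.log(budget.canAttempt(3)); // true while under both caps
```

Checking the caps before each attempt, rather than after, keeps a single expensive Tier 3 call from pushing the run over budget.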
