delivery-manager

安装量: 82
排名: #9635

安装

npx skills add https://github.com/borghei/claude-skills --skill delivery-manager

Delivery Manager

Expert-level delivery management for continuous software delivery.

Core Competencies Release management Deployment coordination Continuous delivery Service operations Incident management Change management Capacity planning SLA management Delivery Framework Delivery Pipeline CODE → BUILD → TEST → STAGE → DEPLOY → MONITOR │ │ │ │ │ │ ▼ ▼ ▼ ▼ ▼ ▼ Commit Compile Unit Integr Prod Observe Review Package Integr UAT Canary Alert Lint Artifact Perf Approval Blue/Grn Respond

Delivery Maturity Model LEVEL 1: Manual Delivery ├── Manual builds ├── Manual testing ├── Manual deployments └── Reactive monitoring

LEVEL 2: Automated Build/Test ├── CI pipeline ├── Automated unit tests ├── Manual deployments └── Basic monitoring

LEVEL 3: Continuous Delivery ├── Full CI/CD pipeline ├── Automated testing ├── Push-button deployments └── Comprehensive monitoring

LEVEL 4: Continuous Deployment ├── Automated deployments ├── Feature flags ├── Canary releases └── Self-healing systems

LEVEL 5: DevOps Excellence ├── Zero-downtime deployments ├── Automated rollbacks ├── Chaos engineering └── Full observability

Release Management Release Planning

Release Plan: v2.5.0

Release Overview

  • Version: 2.5.0
  • Type: Minor Release (new features)
  • Target Date: January 25, 2024
  • Release Manager: [Name]

Scope

Features

| ID | Feature | Owner | Status |

|----|---------|-------|--------|

| F-101 | New checkout flow | Team A | Ready |

| F-102 | Performance improvements | Team B | Ready |

| F-103 | Admin dashboard updates | Team C | In Testing |

Bug Fixes

| ID | Description | Priority | Status |

|----|-------------|----------|--------|

| B-201 | Cart total calculation | High | Fixed |

| B-202 | Login timeout issue | Medium | Fixed |

Dependencies

  • Database migration required
  • API version update
  • Third-party SDK update

Release Criteria

Exit Criteria

  • [ ] All P1/P2 bugs resolved
  • [ ] Performance benchmarks met
  • [ ] Security scan passed
  • [ ] Load testing complete
  • [ ] UAT sign-off received
  • [ ] Documentation updated
  • [ ] Runbook reviewed

Go/No-Go Decision

  • Decision date: January 24, 2024
  • Decision maker: Release Manager + Product Owner
  • Criteria: All exit criteria met

Rollout Strategy

  • Deployment window: January 25, 10:00-14:00 UTC
  • Method: Blue-green deployment
  • Rollback plan: Immediate switch to blue environment

Communication Plan

| When | Who | What |

|------|-----|------|

| T-7 days | Internal teams | Release scope finalized |

| T-1 day | All stakeholders | Go/No-Go decision |

| T-0 | Customer success | Release notes |

| T+1 day | Customers | Changelog email |

Post-Release

  • Monitoring period: 24 hours
  • Success metrics: Error rate <0.1%, latency P99 <500ms
  • Retrospective: January 26, 2024

Release Calendar RELEASE CALENDAR - Q1 2024

January Week 1 Week 2 Week 3 Week 4 ───────────────────────────────────── Code Testing UAT Release Freeze Phase Phase v2.5.0 v2.5 ◆

February Week 1 Week 2 Week 3 Week 4 ───────────────────────────────────── Dev Code Testing Release Sprint Freeze Phase v2.6.0 v2.6 ◆

RELEASE TYPES ◆ Major Release (quarterly) ● Minor Release (monthly) ○ Patch Release (as needed)

Version Management SEMANTIC VERSIONING: MAJOR.MINOR.PATCH

MAJOR (X.0.0) ├── Breaking changes ├── Major features └── Quarterly cadence

MINOR (x.X.0) ├── New features ├── Backward compatible └── Monthly cadence

PATCH (x.x.X) ├── Bug fixes ├── Security patches └── As needed

Deployment Strategies Deployment Patterns

Blue-Green Deployment:

┌─────────────────────────────────────────────────┐ │ LOAD BALANCER │ └─────────────────────┬───────────────────────────┘ │ ┌─────────────┴─────────────┐ │ │ ▼ ▼ ┌───────────────┐ ┌───────────────┐ │ BLUE (v2.4) │ │ GREEN (v2.5) │ │ [Active] │ │ [Staging] │ │ │ │ │ │ ● ● ● ● ● │ │ ● ● ● ● ● │ │ 5 instances │ │ 5 instances │ └───────────────┘ └───────────────┘

SWITCH: Route traffic from Blue → Green ROLLBACK: Route traffic from Green → Blue

Canary Deployment:

┌─────────────────────────────────────────────────┐ │ TRAFFIC DISTRIBUTION │ ├─────────────────────────────────────────────────┤ │ │ │ Stage 1: [█████████████████████████░] 95%/5% │ │ Stage 2: [████████████████████░░░░░░] 75%/25% │ │ Stage 3: [██████████░░░░░░░░░░░░░░░░] 50%/50% │ │ Stage 4: [░░░░░░░░░░░░░░░░░░░░░░░░░█] 0%/100% │ │ │ │ ████ = Old Version ░░░░ = New Version │ └─────────────────────────────────────────────────┘

Rolling Deployment:

TIME → ──────────────────────────────────────────────────── t0: [v1][v1][v1][v1][v1] All instances v1 t1: [v2][v1][v1][v1][v1] 1 instance updated t2: [v2][v2][v1][v1][v1] 2 instances updated t3: [v2][v2][v2][v1][v1] 3 instances updated t4: [v2][v2][v2][v2][v1] 4 instances updated t5: [v2][v2][v2][v2][v2] All instances v2

Deployment Checklist

Deployment Checklist

Pre-Deployment

  • [ ] All tests passing
  • [ ] Security scan complete
  • [ ] Change request approved
  • [ ] Runbook reviewed
  • [ ] Rollback plan documented
  • [ ] Communication sent
  • [ ] On-call team notified
  • [ ] Maintenance window confirmed

Deployment

  • [ ] Database backup complete
  • [ ] Configuration verified
  • [ ] Health checks configured
  • [ ] Deployment initiated
  • [ ] Progress monitored
  • [ ] Health checks passing
  • [ ] Smoke tests passing

Post-Deployment

  • [ ] All services healthy
  • [ ] Metrics within thresholds
  • [ ] No increase in errors
  • [ ] User acceptance verified
  • [ ] Documentation updated
  • [ ] Stakeholders notified
  • [ ] Change request closed

Rollback Triggers

  • Error rate > 1%
  • Latency P99 > 2s
  • Health check failures
  • Critical bug discovered

Incident Management Incident Response INCIDENT SEVERITY LEVELS

SEV-1: Critical ├── Complete service outage ├── Data loss or security breach ├── Response: Immediate (15 min) └── Resolution target: 4 hours

SEV-2: High ├── Major feature unavailable ├── Significant performance degradation ├── Response: 30 minutes └── Resolution target: 8 hours

SEV-3: Medium ├── Minor feature impact ├── Workaround available ├── Response: 2 hours └── Resolution target: 24 hours

SEV-4: Low ├── Cosmetic issues ├── No customer impact ├── Response: 8 hours └── Resolution target: 5 days

Incident Process DETECT → TRIAGE → RESPOND → RESOLVE → REVIEW │ │ │ │ │ ▼ ▼ ▼ ▼ ▼ Alert Assess Incident Fix & Post- Fire Severity Commander Verify mortem Monitor Assign Coordinate Deploy Action Page Notify Communicate Test Items

Incident Template

Incident Report: INC-2024-0125

Summary

[One sentence description of the incident]

Timeline

| Time (UTC) | Event |

|------------|-------|

| 10:15 | Alert triggered |

| 10:18 | On-call paged |

| 10:25 | Incident declared |

| 10:45 | Root cause identified |

| 11:00 | Fix deployed |

| 11:15 | Service restored |

Impact

  • Duration: 1 hour
  • Affected services: [List]
  • Customer impact: [Description]
  • SLA impact: [Yes/No, details]

Root Cause

[Description of what caused the incident]

Resolution

[What was done to resolve the incident]

Action Items

| Action | Owner | Due Date | Status |

|--------|-------|----------|--------|

| [Action 1] | [Name] | [Date] | Open |

| [Action 2] | [Name] | [Date] | Open |

Lessons Learned

  • What went well: [List]
  • What could improve: [List]
  • Follow-up items: [List]

Change Management Change Types Type Description Approval Lead Time Standard Pre-approved, low risk None 0 Normal Regular change CAB 5 days Expedited Urgent business need Manager 24 hours Emergency Critical fix On-call 0 Change Request

Change Request: CHG-2024-0150

Change Details

  • Title: Database schema migration
  • Type: Normal
  • Risk: Medium
  • Environment: Production

Requestor

  • Name: [Name]
  • Team: [Team]
  • Date: January 20, 2024

Description

[Detailed description of the change]

Justification

[Business reason for the change]

Impact Analysis

  • Systems affected: [List]
  • Services affected: [List]
  • Users affected: [Number]
  • Downtime required: [Duration]

Implementation Plan

  1. [Step 1]
  2. [Step 2]
  3. [Step 3]

Rollback Plan

  1. [Rollback step 1]
  2. [Rollback step 2]

Testing Plan

  • [ ] Unit tests
  • [ ] Integration tests
  • [ ] UAT sign-off

Schedule

  • Proposed date: January 25, 2024
  • Start time: 02:00 UTC
  • Duration: 2 hours
  • Change freeze exception: No

Approvals

| Role | Name | Status | Date |

|------|------|--------|------|

| Technical Lead | | Pending | |

| Product Owner | | Pending | |

| CAB | | Pending | |

SLA Management SLA Framework ┌─────────────────────────────────────────────────────────────────┐ │ SLA DASHBOARD │ ├─────────────────────────────────────────────────────────────────┤ │ Service: Customer Portal Period: January 2024 │ ├─────────────────────────────────────────────────────────────────┤ │ AVAILABILITY │ │ Target: 99.9% Actual: 99.95% Status: ✓ Met │ │ Downtime allowed: 43.8 min Used: 21.6 min Remaining: ✓ │ ├─────────────────────────────────────────────────────────────────┤ │ PERFORMANCE │ │ Latency P99 Target: <500ms Actual: 320ms Status: ✓ Met │ │ Error Rate Target: <0.1% Actual: 0.05% Status: ✓ Met │ ├─────────────────────────────────────────────────────────────────┤ │ SUPPORT │ │ SEV-1 Response Target: 15 min Avg: 12 min Status: ✓ Met │ │ SEV-2 Response Target: 30 min Avg: 25 min Status: ✓ Met │ └─────────────────────────────────────────────────────────────────┘

Error Budget ERROR BUDGET CALCULATION

SLA: 99.9% availability Error Budget: 0.1% = 43.8 minutes/month

BUDGET CONSUMPTION ├── Incident 1: 15 min ├── Incident 2: 5 min ├── Maintenance: 0 min (scheduled) └── Total used: 20 min

REMAINING BUDGET └── 23.8 min (54% remaining)

BURN RATE └── Current: 0.8x (on track)

Metrics & Reporting Delivery Metrics Metric Definition Target Deployment Frequency Deploys per day/week 10+/week Lead Time Commit to production <1 day Change Failure Rate Failed deployments <5% MTTR Mean time to recovery <1 hour Availability Uptime percentage 99.9% Delivery Dashboard ┌─────────────────────────────────────────────────────────────────┐ │ DELIVERY PERFORMANCE - [Month] │ ├─────────────────────────────────────────────────────────────────┤ │ Deployments Lead Time Failure Rate MTTR │ │ 48 4.2 hours 2.1% 45 min │ │ +20% MoM -30% MoM -1% MoM -15 min MoM │ ├─────────────────────────────────────────────────────────────────┤ │ DEPLOYMENT TREND │ │ [Line chart showing deployments over time] │ ├─────────────────────────────────────────────────────────────────┤ │ INCIDENT SUMMARY │ │ Total: 5 SEV-1: 0 SEV-2: 2 SEV-3: 3 │ │ MTTR: 45 min MTTD: 5 min │ ├─────────────────────────────────────────────────────────────────┤ │ UPCOMING RELEASES │ │ v2.6.0: Feb 28 v2.7.0: Mar 28 v3.0.0: Apr 30 │ └─────────────────────────────────────────────────────────────────┘

Reference Materials references/release_process.md - Release management guide references/deployment_patterns.md - Deployment strategies references/incident_management.md - Incident response references/sla_management.md - SLA framework Scripts

Release readiness checker

python scripts/release_checker.py --version v2.5.0

Deployment coordinator

python scripts/deploy.py --env production --strategy canary

SLA calculator

python scripts/sla_calculator.py --service portal --period month

Incident reporter

python scripts/incident_report.py --id INC-2024-0125

返回排行榜