kaizen:analyse


Install

npx skills add https://github.com/neolabhq/context-engineering-kit --skill kaizen:analyse
Smart Analysis
Intelligently select and apply the most appropriate Kaizen analysis technique based on what you're analyzing.
Description
Analyzes the context and chooses the best method: Gemba Walk (code exploration), Value Stream Mapping (workflow/process analysis), or Muda Analysis (waste identification), then guides you through the selected technique.
Usage
/analyse [target_description]
Examples:
/analyse authentication implementation
/analyse deployment workflow
/analyse codebase for inefficiencies
Variables
TARGET: What to analyze (default: prompt for input)
METHOD: Override auto-selection (gemba, vsm, muda)
Method Selection Logic
The mappings below drive auto-selection; a minimal selection sketch follows the list.
Gemba Walk
→ When analyzing:
Code implementation (how a feature actually works)
Gap between documentation and reality
Understanding unfamiliar codebase areas
Actual vs. assumed architecture
Value Stream Mapping
→ When analyzing:
Workflows and processes (CI/CD, deployment, development)
Bottlenecks in multi-stage pipelines
Handoffs between teams/systems
Time spent in each process stage
Muda (Waste Analysis)
→ When analyzing:
Code quality and efficiency
Technical debt
Over-engineering or duplication
Resource utilization
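As a concrete illustration, here is a minimal sketch of such keyword-based selection. The stem lists, scoring, and chooseMethod function are illustrative assumptions, not the skill's actual implementation.

```typescript
// Hypothetical sketch of the auto-selection heuristic; the stem lists
// and scoring are illustrative assumptions, not the skill's real logic.
const STEMS: Record<string, string[]> = {
  gemba: ["implement", "architect", "legacy", "unfamiliar"],
  vsm: ["workflow", "process", "pipeline", "deploy", "handoff"],
  muda: ["waste", "debt", "ineffic", "duplicat", "cleanup", "quality"],
};

function chooseMethod(target: string, override?: string): string {
  if (override && override in STEMS) return override; // explicit METHOD wins
  const text = target.toLowerCase();
  let best = "gemba"; // "go and see" is the safest default
  let bestScore = 0;
  for (const [method, stems] of Object.entries(STEMS)) {
    const score = stems.filter((s) => text.includes(s)).length;
    if (score > bestScore) { best = method; bestScore = score; }
  }
  return best;
}

// Mirrors the usage examples above:
console.log(chooseMethod("authentication implementation")); // gemba
console.log(chooseMethod("deployment workflow"));           // vsm
console.log(chooseMethod("codebase for inefficiencies"));   // muda
```

Defaulting to a Gemba Walk when nothing matches mirrors the guidance in the Notes: when an area is unfamiliar, go and see first.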
Steps
1. Understand what's being analyzed
2. Determine the best method (or use the specified METHOD)
3. Explain why this method fits
4. Guide through the analysis
5. Present findings with actionable insights
Method 1: Gemba Walk
"Go and see" the actual code to understand reality vs. assumptions.
When to Use
Understanding how a feature actually works
Code archaeology (legacy systems)
Finding gaps between docs and implementation
Exploring unfamiliar areas before changes
Process
1. Define scope: what code area to explore
2. State assumptions: what you think it does
3. Observe reality: read the actual code
4. Document findings (a recording sketch follows this list):
   • Entry points
   • Actual data flow
   • Surprises (differs from assumptions)
   • Hidden dependencies
   • Undocumented behavior
5. Identify gaps: documentation vs. reality
6. Recommend: update docs, refactor, or accept
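If you want findings captured in a consistent shape across walks, a simple record like the sketch below can help. The schema is an illustrative assumption; the skill does not prescribe one.

```typescript
// Hypothetical shape for recording Gemba Walk output; the schema is an
// illustrative assumption, not something the skill prescribes.
interface GembaFindings {
  scope: string;             // what code area was explored
  assumptions: string[];     // beliefs before reading the code
  entryPoints: string[];     // e.g. "routes/auth.ts:45"
  actualFlow: string[];      // observed data flow
  surprises: string[];       // where reality differed from assumptions
  gaps: string[];            // documentation vs. reality
  recommendations: string[]; // update docs, refactor, or accept
}

const findings: GembaFindings = {
  scope: "User authentication flow",
  assumptions: ["JWT tokens stored in localStorage"],
  entryPoints: ["routes/auth.ts:45"],
  actualFlow: ["POST /api/auth/login -> AuthService.authenticate()"],
  surprises: ["Token is actually in an httpOnly cookie"],
  gaps: ["Docs say 1h session expiry; code uses 24h"],
  recommendations: ["Document the actual 24h expiry"],
};
```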
Example: Authentication System Gemba Walk
SCOPE: User authentication flow
ASSUMPTIONS (Before):
• JWT tokens stored in localStorage
• Single sign-on via OAuth only
• Session expires after 1 hour
• Password reset via email link
GEMBA OBSERVATIONS (Actual Code):
Entry Point: /api/auth/login (routes/auth.ts:45)
├─> AuthService.authenticate() (services/auth.ts:120)
├─> UserRepository.findByEmail() (db/users.ts:67)
├─> bcrypt.compare() (services/auth.ts:145)
└─> TokenService.generate() (services/token.ts:34)
Actual Flow (sketched in code after this example):
1. Login credentials → POST /api/auth/login
2. Password hashed with bcrypt (10 rounds)
3. JWT generated with 24hr expiry (NOT 1 hour!)
4. Token stored in httpOnly cookie (NOT localStorage)
5. Refresh token in separate cookie (15 days)
6. Session data in Redis (30 days TTL)
SURPRISES:
✗ OAuth not implemented (commented out code found)
✗ Password reset is manual (admin intervention)
✗ Three different session storage mechanisms:
- Redis for session data
- Database for "remember me"
- Cookies for tokens
✗ Legacy endpoint /auth/legacy still active (no auth!)
✗ Admin users bypass rate limiting (security issue)
GAPS:
• Documentation says OAuth, code doesn't have it
• Session expiry inconsistent (docs: 1hr, code: 24hr)
• Legacy endpoint not documented (security risk)
• No mention of "remember me" in docs
RECOMMENDATIONS:
1. HIGH: Secure or remove /auth/legacy endpoint
2. HIGH: Document actual session expiry (24hr)
3. MEDIUM: Clean up or implement OAuth
4. MEDIUM: Consolidate session storage (choose one)
5. LOW: Add rate limiting for admin users
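For concreteness, here is a minimal sketch of the flow the walk observed: bcrypt check, 24-hour JWT, httpOnly cookie, session in Redis. Express, bcrypt, jsonwebtoken, ioredis, and the findByEmail stub are all assumptions; the actual internals of services/auth.ts are not shown in the walk.

```typescript
// Minimal sketch of the observed login flow; libraries and helper are
// assumptions, since the walk does not show the real implementation.
import express from "express";
import bcrypt from "bcrypt";
import jwt from "jsonwebtoken";
import Redis from "ioredis";

const app = express();
app.use(express.json());
const redis = new Redis();
const SECRET = "change-me"; // placeholder

// Hypothetical stand-in for UserRepository.findByEmail (db/users.ts:67)
async function findByEmail(
  email: string
): Promise<{ id: string; passwordHash: string } | null> {
  return null; // stub
}

app.post("/api/auth/login", async (req, res) => {
  const user = await findByEmail(req.body.email);
  if (!user || !(await bcrypt.compare(req.body.password, user.passwordHash))) {
    return res.status(401).json({ error: "invalid credentials" });
  }

  // JWT with 24h expiry, matching what the walk found (not the documented 1h)
  const token = jwt.sign({ sub: user.id }, SECRET, { expiresIn: "24h" });

  // Session data in Redis with a 30-day TTL, as observed
  await redis.set(`session:${user.id}`, "{}", "EX", 30 * 24 * 3600);

  // httpOnly cookie, not localStorage, as observed
  res.cookie("access_token", token, { httpOnly: true, maxAge: 24 * 3600 * 1000 });
  return res.json({ ok: true });
});
```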
Example: CI/CD Pipeline Gemba Walk
SCOPE: Build and deployment pipeline
ASSUMPTIONS:
• Automated tests run on every commit
• Deploy to staging automatic
• Production deploy requires approval
GEMBA OBSERVATIONS:
Actual Pipeline (.github/workflows/main.yml):
1. On push to main:
├─> Lint (2 min)
├─> Unit tests (5 min) [SKIPPED if "[skip-tests]" in commit]
├─> Build Docker image (15 min)
└─> Deploy to staging (3 min)
2. Manual trigger for production:
├─> Run integration tests (20 min) [ONLY for production!]
├─> Security scan (10 min)
└─> Deploy to production (5 min)
SURPRISES:
✗ Unit tests can be skipped with commit message flag
✗ Integration tests ONLY run for production deploy
✗ Staging deployed without integration tests
✗ No rollback mechanism (manual kubectl commands)
✗ Secrets loaded from .env file (not secrets manager)
✗ Old "hotfix" branch bypasses all checks
GAPS:
• Staging and production have different test coverage
• Documentation doesn't mention test skip flag
• Rollback process not documented or automated
• Security scan results not enforced (warning only)
RECOMMENDATIONS:
1. CRITICAL: Remove test skip flag capability
2. CRITICAL: Migrate secrets to secrets manager
3. HIGH: Run integration tests on staging too
4. HIGH: Delete or secure hotfix branch
5. MEDIUM: Add automated rollback capability
6. MEDIUM: Make security scan blocking
Method 2: Value Stream Mapping
Map workflow stages, measure time/waste, identify bottlenecks.
When to Use
Process optimization (CI/CD, deployment, code review)
Understanding multi-stage workflows
Finding delays and handoffs
Improving cycle time
Process
1. Identify start and end: where the process begins and ends
2. Map all steps, including waiting/handoff time
3. Measure each step:
   • Processing time (work happening)
   • Waiting time (idle, blocked)
   • Who/what performs the step
4. Calculate metrics (see the sketch after this list):
   • Total lead time
   • Value-add time vs. waste time
   • % efficiency (value-add / total time)
5. Identify bottlenecks: longest steps, most waiting
6. Design future state: the optimized flow
7. Plan improvements: how to achieve the future state
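The metrics in step 4 are simple arithmetic over per-step times. A minimal sketch, using the step data from the feature-development example below (the numbers are from the example; the code itself is illustrative):

```typescript
// Computes VSM metrics from [name, processingDays, waitingDays] tuples.
// Step data comes from the feature-development example below.
const steps: [string, number, number][] = [
  ["Requirements", 2.0, 3.0],
  ["Design", 1.0, 2.0],
  ["Development", 5.0, 2.0],
  ["Code Review", 0.5, 1.0],
  ["QA Testing", 2.0, 1.0],
  ["Staging Deploy", 0.5, 2.0],
  ["Production Deploy", 0.5, 0.0],
];

const valueAdd = steps.reduce((sum, [, p]) => sum + p, 0); // work happening
const waste = steps.reduce((sum, [, , w]) => sum + w, 0);  // waiting
const leadTime = valueAdd + waste;                         // end to end

console.log(`Lead time: ${leadTime} days`);   // 22.5
console.log(`Value-add: ${valueAdd} days`);   // 11.5
console.log(`Waste: ${waste} days`);          // 11
console.log(`Efficiency: ${Math.round((valueAdd / leadTime) * 100)}%`); // 51%
```

Running this against the current-state map below reproduces the documented metrics: a 22.5-day lead time at 51% efficiency.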
Example: Feature Development Value Stream Map
CURRENT STATE: Feature request → Production
Step 1: Requirements Gathering
├─ Processing: 2 days (meetings, writing spec)
├─ Waiting: 3 days (stakeholder review)
└─ Owner: Product Manager
Step 2: Design
├─ Processing: 1 day (mockups, architecture)
├─ Waiting: 2 days (design review, feedback)
└─ Owner: Designer + Architect
Step 3: Development
├─ Processing: 5 days (coding)
├─ Waiting: 2 days (PR review queue)
└─ Owner: Developer
Step 4: Code Review
├─ Processing: 0.5 days (review)
├─ Waiting: 1 day (back-and-forth changes)
└─ Owner: Senior Developer
Step 5: QA Testing
├─ Processing: 2 days (manual testing)
├─ Waiting: 1 day (bug fixes, retest)
└─ Owner: QA Engineer
Step 6: Staging Deployment
├─ Processing: 0.5 days (deploy, smoke test)
├─ Waiting: 2 days (stakeholder UAT)
└─ Owner: DevOps
Step 7: Production Deployment
├─ Processing: 0.5 days (deploy, monitor)
├─ Waiting: 0 days
└─ Owner: DevOps
───────────────────────────────────────
METRICS:
Total Lead Time: 22.5 days
Value-Add Time: 11.5 days (work)
Waste Time: 11 days (waiting)
Efficiency: 51%
BOTTLENECKS:
1. Requirements review wait (3 days)
2. Development time (5 days)
3. Stakeholder UAT wait (2 days)
4. PR review queue (2 days)
WASTE ANALYSIS:
• Waiting for reviews/approvals: 9 days (82% of waste)
• Rework due to unclear requirements: ~1 day
• Manual testing time: 2 days
FUTURE STATE DESIGN:
Changes:
1. Async requirements approval (stakeholders have 24hr SLA)
2. Split large features into smaller increments
3. Automated testing replaces manual QA
4. PR review SLA: 4 hours max
5. Continuous deployment to staging (no approval)
6. Feature flags for production rollout (no wait)
Projected Future State:
Total Lead Time: 9 days (60% reduction)
Value-Add Time: 8 days
Waste Time: 1 day
Efficiency: 89%
IMPLEMENTATION PLAN:
Week 1: Set review SLAs, add feature flags
Week 2: Automate test suite
Week 3: Enable continuous staging deployment
Week 4: Train team on incremental delivery
Example: Incident Response Value Stream Map
CURRENT STATE: Incident detected → Resolution
Step 1: Detection
├─ Processing: 0 min (automated alert)
├─ Waiting: 15 min (until someone sees alert)
└─ System: Monitoring tool
Step 2: Triage
├─ Processing: 10 min (assess severity)
├─ Waiting: 20 min (find right person)
└─ Owner: On-call engineer
Step 3: Investigation
├─ Processing: 45 min (logs, debugging)
├─ Waiting: 30 min (access to production, gather context)
└─ Owner: Engineer + SRE
Step 4: Fix Development
├─ Processing: 60 min (write fix)
├─ Waiting: 15 min (code review)
└─ Owner: Engineer
Step 5: Deployment
├─ Processing: 10 min (hotfix deploy)
├─ Waiting: 5 min (verification)
└─ Owner: SRE
Step 6: Post-Incident
├─ Processing: 20 min (update status, notify)
├─ Waiting: 0 min
└─ Owner: Engineer
───────────────────────────────────────
METRICS:
Total Lead Time: 230 min (3h 50min)
Value-Add Time: 145 min
Waste Time: 85 min (37%)
BOTTLENECKS:
1. Finding right person (20 min)
2. Gaining production access (30 min)
3. Investigation time (45 min)
IMPROVEMENTS:
1. Slack integration for alerts (reduce detection wait)
2. Auto-assign by service owner (no hunt for person)
3. Pre-approved prod access for on-call (reduce wait)
4. Runbooks for common incidents (faster investigation)
5. Automated rollback for deployment incidents
Projected improvement: 230min → 120min (48% faster)
Method 3: Muda (Waste Analysis)
Identify seven types of waste in code and development processes.
When to Use
Code quality audits
Technical debt assessment
Process efficiency improvements
Identifying over-engineering
The 7 Types of Waste (Applied to Software)
Each maps a classic manufacturing waste onto software work; a small tally sketch follows the list.
1. Overproduction: building more than needed
   • Features no one uses
   • Overly complex solutions
   • Premature optimization
   • Unnecessary abstractions
2. Waiting: idle time
   • Build/test/deploy time
   • Code review delays
   • Waiting for dependencies
   • Blocked by other teams
3. Transportation: moving things around
   • Unnecessary data transformations
   • API layers with no value add
   • Copying data between systems
   • Repeated serialization/deserialization
4. Over-processing: doing more than necessary
   • Excessive logging
   • Redundant validations
   • Over-normalized databases
   • Unnecessary computation
5. Inventory: work in progress
   • Unmerged branches
   • Half-finished features
   • Untriaged bugs
   • Undeployed code
6. Motion: unnecessary movement
   • Context switching
   • Meetings without purpose
   • Manual deployments
   • Repetitive tasks
7. Defects: rework and bugs
   • Production bugs
   • Technical debt
   • Flaky tests
   • Incomplete features
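To support the quantify-and-prioritize steps in the process that follows, findings can be tallied per waste type and ranked by estimated impact. A minimal sketch; the hours-per-month scale and the sample entries are illustrative assumptions:

```typescript
// Illustrative sketch: tally findings per waste type and rank by
// estimated impact. The impact scale and samples are assumptions.
type Finding = { description: string; hoursPerMonth: number };

const audit = new Map<string, Finding[]>();

function record(wasteType: string, description: string, hoursPerMonth: number): void {
  const list = audit.get(wasteType) ?? [];
  list.push({ description, hoursPerMonth });
  audit.set(wasteType, list);
}

record("waiting", "CI pipeline takes 45 min per run", 30);
record("overproduction", "15 unused API endpoints maintained", 8);
record("motion", "manual deployments across 4 tools", 10);

// Biggest total impact first
const ranked = [...audit.entries()]
  .map(([type, items]) => [type, items.reduce((s, f) => s + f.hoursPerMonth, 0)] as const)
  .sort((a, b) => b[1] - a[1]);

for (const [type, hours] of ranked) {
  console.log(`${type}: ~${hours} hours/month`);
}
```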
Process
1. Define scope: codebase area or process
2. Examine for each waste type
3. Quantify impact (time, complexity, cost)
4. Prioritize by impact
5. Propose elimination strategies
Example: API Codebase Waste Analysis
SCOPE: REST API backend (50K LOC)
1. OVERPRODUCTION
Found:
• 15 API endpoints with zero usage (last 90 days)
• Generic "framework" built for "future flexibility" (unused)
• Premature microservices split (2 services, could be 1)
• Feature flags for 12 features (10 fully rolled out, flags kept)
Impact: 8K LOC maintained for no reason
Recommendation: Delete unused endpoints, remove stale flags
2. WAITING
Found:
• CI pipeline: 45 min (slow Docker builds)
• PR review time: avg 2 days
• Deployment to staging: manual, takes 1 hour
Impact: 2.5 days wasted per feature
Recommendation: Cache Docker layers, set a PR review SLA, automate staging
3. TRANSPORTATION
Found:
• Data transformed 4 times between DB and API response:
  DB → ORM → Service → DTO → Serializer
• Request/response logged 3 times (middleware, handler, service)
• Files uploaded → S3 → CloudFront → local cache (unnecessary)
Impact: 200ms avg response-time overhead
Recommendation: Reduce transformation layers, consolidate logging
4. OVER-PROCESSING
Found:
• Every request validates the auth token (even when cached)
• Database queries fetch all columns (SELECT *)
• JSON responses include full object graphs (nested 5 levels)
• Every database query logged in production (verbose)
Impact: 40% higher database load, 3x log storage
Recommendation: Cache auth checks, select only needed fields, trim responses
5. INVENTORY
Found:
• 23 open PRs (8 abandoned, 6+ months old)
• 5 feature branches unmerged (completed but not deployed)
• 147 open bugs (42 duplicates, 60 not reproducible)
• 12 hotfix commits not backported to main
Impact: Context overhead, merge conflicts, lost work
Recommendation: Close stale PRs, triage bugs, deploy pending features
6. MOTION
Found:
• Developers switch between 4 tools for one deployment
• Manual database migrations (error-prone, slow)
• Environment config spread across 6 files
• Secrets copy-pasted into .env files
Impact: 30 min per deployment, frequent mistakes
Recommendation: Unified deployment tool, automate migrations
7. DEFECTS
Found:
• 12 production bugs per month
• 15% flaky-test rate (wasted retry time)
• Technical debt in the auth module (refactor needed)
• Incomplete error handling (crashes instead of degrading gracefully)
Impact: Customer complaints, rework, downtime
Recommendation: Stabilize tests, refactor auth, add error boundaries
───────────────────────────────────────
SUMMARY
Total Waste Identified:
• Code: 8K LOC doing nothing
• Time: 2.5 days per feature
• Performance: 200ms overhead per request
• Effort: 30 min per deployment
Priority Fixes (by impact):
1. HIGH: Automate deployments (reduces Motion + Waiting)
2. HIGH: Fix flaky tests (reduces Defects)
3. MEDIUM: Remove unused code (reduces Overproduction)
4. MEDIUM: Optimize data transformations (reduces Transportation)
5. LOW: Triage bug backlog (reduces Inventory)
Estimated Recovery:
• 20% faster feature delivery
• 50% fewer production issues
• 30% less operational overhead
Notes
Method selection is contextual: choose what fits best
Methods can be combined (Gemba Walk → Muda Analysis)
Start with a Gemba Walk when unfamiliar with an area
Use VSM for process optimization
Use Muda for efficiency and cleanup
All methods should lead to actionable improvements
Document findings for organizational learning
Consider using /analyse-problem (A3) for comprehensive documentation of findings