- Architecture Design
- Design system architecture and make strategic technical decisions.
- Core Principle
- Good architecture enables change while maintaining simplicity.
- Name
- han-core:architect - Design system architecture and high-level technical strategy
- Synopsis
- /architect [arguments]
- Architecture vs Planning
- Architecture Design (this skill):
- Strategic: "How should the system be structured?"
- Component interactions and boundaries
- Technology and pattern choices
- Long-term implications
- System-level decisions
- Technical Planning:
- Tactical: "How do I implement feature X?"
- Specific implementation tasks
- Execution details
- Short-term focus
- Use /architect when:
- Designing new systems or subsystems
- Significant system change (new subsystem, major refactor)
- Affects multiple components or teams
- Major refactors affecting multiple components
- Technology selection decisions
- Defining system boundaries and interfaces
- Long-term technical strategy needed
- Need to evaluate multiple approaches
- Decisions have broad impact
- Use /plan when:
- Implementing within existing architecture
- Implementing specific feature within existing architecture
- Tactical execution planning
- Breaking down known work
- Architecture is already decided
- Task sequencing and execution
- Architecture Process
- 1. Understand Context
- Business context:
- What problem are we solving?
- Who are the users?
- What are the business goals?
- What are the success metrics?
- Technical context:
- What exists today?
- What constraints exist?
- What must we integrate with?
- What scale must we support?
- Team context:
- What's our expertise?
- What can we maintain?
- What's our velocity?
- 2. Gather Requirements
- Functional requirements:
- What must the system do?
- What are the features?
- What are the user scenarios?
- Non-functional requirements:
- Performance
-
- Response time, throughput
- Scalability
-
- Expected load, growth
- Availability
-
- Uptime requirements
- Security
-
- Compliance, data protection
- Maintainability
-
- Team size, skills
- Cost
- Budget constraints Example:
Requirements
Functional
Users can search products by name/category
Users can add items to cart
Users can checkout and pay
Non-Functional
Search response time < 200ms (p95)
Support 10,000 concurrent users
99.9% uptime
PCI DSS compliant for payments
Team of 5 developers can maintain 3. Identify Constraints Technical constraints: Must use existing authentication system Must integrate with legacy inventory system Database must be PostgreSQL (existing infrastructure) Business constraints: Budget for infrastructure Must support EU data residency Team constraints: Team experienced in Python, less in Go No DevOps specialist on team Remote team across timezones 4. Consider Alternatives Never design in a vacuum - consider options: Example: Data storage choice Option 1: PostgreSQL Pros: Team knows it, ACID guarantees, rich query support Cons: Vertical scaling limits, setup complexity Option 2: MongoDB Pros: Flexible schema, horizontal scaling Cons: Team unfamiliar, eventual consistency Option 3: DynamoDB Pros: Fully managed, auto-scaling Cons: Vendor lock-in, query limitations, cost at scale Decision: PostgreSQL Team expertise outweighs scaling concerns Can re-evaluate if scale becomes issue Faster initial development 5. Design System Structure Define components and their responsibilities: ┌─────────────────────────────────────────────┐ │ Client Apps │ │ (Web, iOS, Android) │ └────────────────┬────────────────────────────┘ │ ▼ ┌─────────────────────────────────────────────┐ │ API Gateway / Load Balancer │ └────────────────┬────────────────────────────┘ │ ┌────────┴────────┐ ▼ ▼ ┌───────────────┐ ┌───────────────┐ │ Auth │ │ Core API │ │ Service │ │ Service │ └───────┬───────┘ └───────┬───────┘ │ │ │ ┌────────┴────────┐ │ ▼ ▼ │ ┌──────────────┐ ┌──────────────┐ │ │ PostgreSQL │ │ Redis │ │ │ (Primary) │ │ (Cache) │ │ └──────────────┘ └──────────────┘ │ ▼ ┌───────────────┐ │ User DB │ └───────────────┘ Component descriptions:
Components
API Gateway ** Responsibility: ** Route requests, rate limiting, authentication ** Technology: ** Nginx ** Dependencies: ** Auth Service, Core API Service ** Scale: ** 2-3 instances behind load balancer
Auth Service ** Responsibility: ** User authentication, session management, JWT issuing ** Technology: ** Python (Flask), PostgreSQL ** API: ** REST ** Scale: ** Stateless, 2-N instances
Core API Service ** Responsibility: ** Business logic, data access, external integrations ** Technology: ** Python (FastAPI), PostgreSQL, Redis ** API: ** REST ** Scale: ** Stateless, 2-N instances
PostgreSQL ** Responsibility: ** Primary data store ** Scale: ** Primary with read replica
Redis ** Responsibility: ** Session storage, caching, rate limiting ** Scale: ** Cluster mode (3 nodes) 6. Define Interfaces API contracts:
API Design
POST /api/auth/login ** Purpose: ** Authenticate user, issue JWT ** Request: ** ```json { "email": "user@example.com", "password": "secure_password" } Response (200): { "token" : "eyJ..." , "user" : { "id" : "123" , "email" : "user@example.com" , "name" : "John Doe" } } Errors: 400: Invalid request 401: Invalid credentials 429: Rate limit exceeded
7. Plan for Failure
What can go wrong? - Database unavailable - External API down - Network partition - High load - Data corruption Mitigation strategies: - Retry with exponential backoff - Circuit breakers for external services - Graceful degradation - Health checks and monitoring - Database backups Example: ```markdown
Failure Scenarios
Database Unavailable
Impact: Cannot read/write data Mitigation: - Read replica failover (automated) - Circuit breaker after 3 failures - Cache serves stale data for 5 minutes - User sees degraded experience message Recovery: Manual failover to replica, fix primary
External Payment API Down
Impact: Cannot process payments Mitigation: - Retry 3 times with exponential backoff - Queue payments for later processing - User notified of delay - Alert on-call engineer Recovery: Process queued payments once API recovers 8. Document Decisions Architecture Decision Record (ADR):
ADR-001: Use PostgreSQL for Primary Database ** Status: ** Accepted ** Date: ** 2024-01-15 ** Deciders: ** Tech Lead, Backend Team
Context We need to choose a primary database for user data, products, and orders. Requirements: - Strong consistency (ACID) - Complex queries (joins, aggregations) - < 200ms query time for 90% of queries - Support 100k users initially
Decision Use PostgreSQL as primary database.
Alternatives Considered
MongoDB
** Pros: ** Flexible schema, horizontal scaling - ** Cons: ** Team unfamiliar, eventual consistency issues - ** Why not: ** Team expertise more valuable than flexibility
DynamoDB
** Pros: ** Managed service, auto-scaling - ** Cons: ** Vendor lock-in, limited query capability, cost - ** Why not: ** Query limitations would hurt development velocity
MySQL
** Pros: ** Similar to PostgreSQL, team knows it - ** Cons: ** Less feature-rich than PostgreSQL - ** Why not: ** PostgreSQL offers JSON support, better full-text search
Consequences ** Positive: ** - Team can be productive immediately - Strong consistency guarantees - Rich query capabilities - JSON support for flexible data ** Negative: ** - Vertical scaling limits (mitigated: can add read replicas) - More complex than managed services (mitigated: use RDS) - Higher operational overhead ** Trade-offs: ** - Chose familiarity over horizontal scaling - Chose rich queries over eventual consistency - Can re-evaluate if scale requirements change
Validation
Team confirmed expertise in PostgreSQL
Load testing shows meets performance requirements
Cost analysis shows acceptable for first year Architecture Document Structure
Architecture Design: [System/Feature Name]
Context
Problem Statement [What business problem are we solving?]
Goals [What are we trying to achieve?]
Non-Goals [What are we explicitly NOT trying to achieve?]
Requirements ** Functional: ** - [Requirement 1] - [Requirement 2] ** Non-Functional: ** - Performance: [e.g., < 200ms response time] - Scalability: [e.g., handle 10k concurrent users] - Security: [e.g., PCI compliance] - Maintainability: [e.g., easy to modify]
Constraints [Technical, time, resource, or business constraints]
Current Architecture [What exists today? What needs to change?]
Proposed Architecture
High-Level Design [Diagram or description of system components] ┌─────────────┐ ┌─────────────┐ ┌──────────────┐ │ Client │────▶│ API │────▶│ Database │ └─────────────┘ └─────────────┘ └──────────────┘ │ ▼ ┌─────────────┐ │ Cache │ └─────────────┘
Components
Component A
Responsibility: [What it does] Interface: [How others interact with it] Dependencies: [What it depends on] Technology: [Implementation stack]
Component B
[Similar structure...]
Data Flow
[How data moves through the system]
API Design
[Key endpoints, schemas, contracts]
Data Model
[Database schema, key entities]
Security Model
[Authentication, authorization, data protection]
Alternative Approaches Considered
Alternative 1: [Approach name]
Pros: - [Advantage 1] - [Advantage 2] Cons: - [Disadvantage 1] - [Disadvantage 2] Why not chosen: [Reasoning]
Alternative 2: [Another approach]
[Similar structure...]
Decision Rationale
Why This Architecture?
[Explain the key decisions and trade-offs]
Trade-offs Accepted
[What we gave up for what benefits]
Assumptions
[What we're assuming to be true]
Implementation Strategy
Phase 1: [Foundation]
[What to build first]
Phase 2: [Core Features]
[Next phase]
Phase 3: [Enhancement]
[Final phase]
Migration Strategy
[If replacing existing system, how to transition?]
Risks & Mitigation
| Risk | Impact | Probability | Mitigation |
|---|---|---|---|
| [Risk 1] | High | Medium | [How to mitigate] |
| [Risk 2] | Medium | Low | [How to mitigate] |
| ## Testing Strategy | |||
| [How will we validate this architecture?] | |||
| ## Monitoring & Observability | |||
| [How will we know if it's working?] | |||
| ## Success Metrics | |||
| [How will we measure success?] | |||
| ## Open Questions | |||
| [What still needs to be resolved?] | |||
| ## References | |||
| [Links to research, related docs, RFCs] | |||
| Architecture Principles | |||
| 1. Simplicity | |||
| Start simple, add complexity only when needed. | |||
| BAD: Microservices from day 1 with 20 services | |||
| GOOD: Start with monolith, split when needed | |||
| Apply YAGNI: | |||
| You Aren't Gonna Need It | |||
| Don't build for hypothetical future | |||
| Add when actually needed | |||
| Simpler is easier to maintain | |||
| 2. Separation of Concerns | |||
| Each component has one clear responsibility. | |||
| GOOD: | |||
| - Auth Service: Authentication only | |||
| - User Service: User profile management | |||
| - Order Service: Order processing | |||
| BAD: | |||
| - God Service: Does everything | |||
| Apply SOLID principles: | |||
| Single Responsibility | |||
| Open/Closed | |||
| Liskov Substitution | |||
| Interface Segregation | |||
| Dependency Inversion | |||
| 3. Loose Coupling | |||
| Components depend on interfaces, not implementations. | |||
| // BAD: Tight coupling | |||
| class | |||
| OrderService | |||
| { | |||
| constructor | |||
| ( | |||
| private | |||
| db | |||
| : | |||
| PostgresDatabase | |||
| ) | |||
| { | |||
| } | |||
| } | |||
| // GOOD: Loose coupling | |||
| class | |||
| OrderService | |||
| { | |||
| constructor | |||
| ( | |||
| private | |||
| db | |||
| : | |||
| Database | |||
| ) | |||
| { | |||
| } | |||
| // Interface | |||
| } | |||
| Benefits: | |||
| Easier to test (mock interface) | |||
| Easier to swap implementations | |||
| Components can evolve independently | |||
| 4. High Cohesion | |||
| Related functionality stays together. | |||
| GOOD: | |||
| user/ | |||
| - create_user.ts | |||
| - update_user.ts | |||
| - delete_user.ts | |||
| - user_repository.ts | |||
| BAD: | |||
| create/ | |||
| - create_user.ts | |||
| - create_order.ts | |||
| update/ | |||
| - update_user.ts | |||
| - update_order.ts | |||
| 5. Explicit Over Implicit | |||
| Make dependencies and contracts clear. | |||
| // BAD: Implicit dependency | |||
| function | |||
| processOrder | |||
| ( | |||
| orderId | |||
| : | |||
| string | |||
| ) | |||
| { | |||
| const | |||
| db | |||
| = | |||
| global | |||
| . | |||
| database | |||
| // Where does this come from? | |||
| // ... | |||
| } | |||
| // GOOD: Explicit dependency | |||
| function | |||
| processOrder | |||
| ( | |||
| orderId | |||
| : | |||
| string | |||
| , | |||
| db | |||
| : | |||
| Database | |||
| , | |||
| logger | |||
| : | |||
| Logger | |||
| ) | |||
| { | |||
| // Dependencies are clear | |||
| } | |||
| 6. Fail Fast | |||
| Detect and report errors early. | |||
| // BAD: Silent failure | |||
| function | |||
| divide | |||
| ( | |||
| a | |||
| : | |||
| number | |||
| , | |||
| b | |||
| : | |||
| number | |||
| ) | |||
| { | |||
| if | |||
| ( | |||
| b | |||
| === | |||
| 0 | |||
| ) | |||
| return | |||
| 0 | |||
| // Wrong! | |||
| return | |||
| a | |||
| / | |||
| b | |||
| } | |||
| // GOOD: Fail fast | |||
| function | |||
| divide | |||
| ( | |||
| a | |||
| : | |||
| number | |||
| , | |||
| b | |||
| : | |||
| number | |||
| ) | |||
| { | |||
| if | |||
| ( | |||
| b | |||
| === | |||
| 0 | |||
| ) | |||
| { | |||
| throw | |||
| new | |||
| Error | |||
| ( | |||
| 'Division by zero' | |||
| ) | |||
| } | |||
| return | |||
| a | |||
| / | |||
| b | |||
| } | |||
| 7. Design for Testability | |||
| Make it easy to test. | |||
| // BAD: Hard to test | |||
| class | |||
| OrderService | |||
| { | |||
| processOrder | |||
| ( | |||
| orderId | |||
| : | |||
| string | |||
| ) | |||
| { | |||
| const | |||
| db | |||
| = | |||
| new | |||
| PostgresDatabase | |||
| ( | |||
| ) | |||
| // Can't mock | |||
| const | |||
| api | |||
| = | |||
| new | |||
| PaymentAPI | |||
| ( | |||
| ) | |||
| // Can't mock | |||
| // ... | |||
| } | |||
| } | |||
| // GOOD: Easy to test | |||
| class | |||
| OrderService | |||
| { | |||
| constructor | |||
| ( | |||
| private | |||
| db | |||
| : | |||
| Database | |||
| , | |||
| // Can inject mock | |||
| private | |||
| api | |||
| : | |||
| PaymentAPI | |||
| // Can inject mock | |||
| ) | |||
| { | |||
| } | |||
| processOrder | |||
| ( | |||
| orderId | |||
| : | |||
| string | |||
| ) | |||
| { | |||
| // ... | |||
| } | |||
| } | |||
| Common Architecture Patterns | |||
| Layered Architecture | |||
| ┌─────────────────────┐ | |||
| │ Presentation │ (UI, API controllers) | |||
| ├─────────────────────┤ | |||
| │ Business Logic │ (Domain, services) | |||
| ├─────────────────────┤ | |||
| │ Data Access │ (Repositories, ORMs) | |||
| ├─────────────────────┤ | |||
| │ Database │ (Storage) | |||
| └─────────────────────┘ | |||
| When to use: | |||
| Simple to moderate complexity | |||
| Hexagonal Architecture (Ports & Adapters) | |||
| ┌───────────────────────┐ | |||
| │ External Systems │ | |||
| │ (UI, DB, APIs) │ | |||
| └──────────┬────────────┘ | |||
| │ | |||
| ┌──────────▼────────────┐ | |||
| │ Adapters │ (Implementation) | |||
| │ (REST, PostgreSQL) │ | |||
| └──────────┬────────────┘ | |||
| │ | |||
| ┌──────────▼────────────┐ | |||
| │ Ports │ (Interfaces) | |||
| │ (IUserRepo, IAuth) │ | |||
| └──────────┬────────────┘ | |||
| │ | |||
| ┌──────────▼────────────┐ | |||
| │ Core Domain │ (Business logic) | |||
| │ (Pure logic) │ | |||
| └───────────────────────┘ | |||
| When to use: | |||
| Want to isolate business logic, multiple frontends | |||
| Microservices | |||
| ┌─────────┐ ┌─────────┐ ┌─────────┐ | |||
| │ User │ │ Order │ │ Payment │ | |||
| │ Service │ │ Service │ │ Service │ | |||
| └────┬────┘ └────┬────┘ └────┬────┘ | |||
| │ │ │ | |||
| └────────────┴────────────┘ | |||
| │ | |||
| ┌───────▼────────┐ | |||
| │ Message Bus │ | |||
| │ (Event-driven)│ | |||
| └────────────────┘ | |||
| When to use: | |||
| Large team, need independent deploy, clear boundaries | |||
| Avoid when: | |||
| Small team, unclear boundaries, early stage | |||
| Event-Driven Architecture | |||
| ┌─────────┐ ┌─────────────┐ ┌─────────┐ | |||
| │Producer │──────▶│ Event Bus │──────▶│Consumer │ | |||
| └─────────┘ └─────────────┘ └─────────┘ | |||
| When to use: | |||
| Async processing, decoupled systems, audit trails | |||
| Architectural Patterns to Consider | |||
| System Patterns: | |||
| Layered architecture | |||
| Microservices vs monolith | |||
| Event-driven architecture | |||
| CQRS (Command Query Responsibility Segregation) | |||
| Hexagonal architecture | |||
| Data Patterns: | |||
| Database per service | |||
| Shared database | |||
| Event sourcing | |||
| CQRS | |||
| Cache-aside | |||
| Integration Patterns: | |||
| API Gateway | |||
| Service mesh | |||
| Message queue | |||
| Pub/sub | |||
| GraphQL federation | |||
| Anti-Patterns | |||
| Premature Optimization | |||
| Don't optimize for scale you don't have. | |||
| BAD: Build microservices for 100 users | |||
| GOOD: Start with monolith, split when needed | |||
| Resume-Driven Architecture | |||
| Don't choose technology to pad resume. | |||
| BAD: "I want to learn Kubernetes, let's use it" | |||
| GOOD: "Kubernetes fits our scale needs" | |||
| Distributed Monolith | |||
| Microservices that are tightly coupled. | |||
| BAD: Service A can't deploy without Service B | |||
| GOOD: Services are independently deployable | |||
| Big Ball of Mud | |||
| No structure, everything depends on everything. | |||
| BAD: Any code can call any other code | |||
| GOOD: Clear layers and boundaries | |||
| Analysis Paralysis | |||
| Over-analyzing, never shipping. | |||
| BAD: Spend months on perfect architecture | |||
| GOOD: Design enough to start, iterate | |||
| Architecture Review Checklist | |||
| Business goals clearly understood | |||
| Functional requirements documented | |||
| Non-functional requirements defined | |||
| Constraints identified | |||
| Multiple alternatives considered | |||
| Trade-offs explicitly stated | |||
| Component responsibilities clear | |||
| Interfaces well-defined | |||
| Data flow documented | |||
| Failure scenarios planned for | |||
| Security model defined | |||
| Scalability addressed | |||
| Testability designed in | |||
| Decisions documented (ADRs) | |||
| Implementation phases outlined | |||
| Team can implement and maintain | |||
| Success metrics defined | |||
| Examples | |||
| When the user says: | |||
| "Design the architecture for our multi-tenant system" | |||
| "How should we structure our microservices?" | |||
| "Plan the technical approach for real-time notifications" | |||
| "Design the data model for our marketplace" | |||
| "Create architecture for migrating from monolith to services" | |||
| Integration with Other Skills | |||
| Apply | |||
| solid-principles | |||
| - Guide component design | |||
| Apply | |||
| simplicity-principles | |||
| - KISS, YAGNI | |||
| Apply | |||
| orthogonality-principle | |||
| - Independent components | |||
| Apply | |||
| structural-design-principles | |||
| - Composition patterns | |||
| Use | |||
| technical-planning | |||
| - For implementation after design | |||
| Notes | |||
| Use TaskCreate to track architecture design steps | |||
| Apply all relevant design principle skills | |||
| Create diagrams (ASCII art or reference drawing tools) | |||
| Architecture evolves - document decisions and changes | |||
| Consider using /plan for detailed implementation after architecture is approved | |||
| Archive decisions as ADRs for future reference | |||
| Remember | |||
| Simplicity first | |||
| - Start simple, add complexity when needed | |||
| Document decisions | |||
| - Future you will thank you | |||
| Consider alternatives | |||
| - Never the first idea only | |||
| State trade-offs | |||
| - Every decision has consequences | |||
| Design for change | |||
| - Systems evolve | |||
| The best architecture is the one that's simple enough to ship and flexible enough to evolve. |