system-architecture

安装量: 57
排名: #13052

安装

npx skills add https://github.com/yennanliu/cs_basics --skill system-architecture

Use this Skill when:

  • Designing distributed systems

  • Writing system design documentation

  • Preparing for system design interviews

  • Creating architecture diagrams

  • Analyzing trade-offs between design choices

  • Reviewing or improving existing system designs

System Design Framework

1. Requirements Gathering (5-10 minutes)

Functional Requirements:

  • What are the core features?

  • What actions can users perform?

  • What are the inputs and outputs?

Non-Functional Requirements:

  • Scale: How many users? How much data?

  • Performance: Latency requirements? (p50, p95, p99)

  • Availability: What uptime is needed? (99.9%, 99.99%)

  • Consistency: Strong or eventual consistency?

Constraints:

  • Budget limitations

  • Technology stack constraints

  • Team expertise

  • Timeline

Example Questions:

- How many daily active users?
- What's the read:write ratio?
- What's the average data size?
- What's the peak load vs average load?
- Do we need real-time updates?
- Can we have data loss?

2. Capacity Estimation (Back-of-the-envelope)

Calculate:

Traffic:
- DAU = 100M users
- Each user makes 10 requests/day
- QPS = 100M * 10 / 86400 ≈ 11,574 QPS
- Peak QPS = 2-3x average ≈ 30,000 QPS

Storage:
- 100M users * 1KB per user = 100GB
- With 3x replication = 300GB
- Growth: 300GB * 365 days = 109.5TB/year

Bandwidth:
- QPS * average request size
- 11,574 * 10KB = 115.74MB/s

Memory/Cache:

  • 80-20 rule: 20% of data gets 80% of traffic

  • Cache = 20% of total data for hot data

3. High-Level Design

Core Components:

  • Client Layer (Web, Mobile, Desktop)

  • API Gateway / Load Balancer

  • Application Servers (Business logic)

  • Cache Layer (Redis, Memcached)

  • Database (SQL, NoSQL, or both)

  • Message Queue (Kafka, RabbitMQ)

  • Object Storage (S3, GCS)

  • CDN (CloudFront, Akamai)

Draw Architecture:

[Clients] → [CDN]
            ↓
        [Load Balancer]
            ↓
    [Application Servers]
        ↙     ↓     ↘
   [Cache] [DB] [Queue] → [Workers]
                            ↓
                      [Object Storage]

4. Database Design

SQL vs NoSQL Decision:

Use SQL when:

  • ACID transactions required

  • Complex queries with JOINs

  • Structured data with relationships

  • Examples: PostgreSQL, MySQL

Use NoSQL when:

  • Massive scale (horizontal scaling)

  • Flexible schema

  • High write throughput

  • Examples: Cassandra, DynamoDB, MongoDB

Sharding Strategy:

  • Hash-based: user_id % num_shards

  • Range-based: Users 1-100M on shard 1

  • Geographic: US users on US shard

  • Consistent hashing: For even distribution

Schema Design:

-- Example: URL Shortener
CREATE TABLE urls (
    id BIGSERIAL PRIMARY KEY,
    short_url VARCHAR(10) UNIQUE NOT NULL,
    long_url TEXT NOT NULL,
    user_id BIGINT,
    created_at TIMESTAMP DEFAULT NOW(),
    expires_at TIMESTAMP,
    click_count INT DEFAULT 0,
    INDEX (short_url),
    INDEX (user_id)
);

5. Deep Dive Components

Caching Strategy:

  • Cache-Aside: App reads from cache, loads from DB on miss

  • Write-Through: Write to cache and DB together

  • Write-Behind: Write to cache, async write to DB

Eviction Policies:

  • LRU (Least Recently Used) - Most common

  • LFU (Least Frequently Used)

  • TTL (Time To Live)

Load Balancing:

  • Round Robin: Simple, equal distribution

  • Least Connections: Route to least busy server

  • Consistent Hashing: Minimize redistribution

  • Weighted: Based on server capacity

Message Queue Patterns:

  • Pub/Sub: One-to-many (notifications)

  • Work Queue: Task distribution (job processing)

  • Fan-out: Broadcast to multiple queues

6. Scalability Patterns

Horizontal Scaling:

  • Add more servers

  • Use load balancers

  • Stateless application servers

  • Session stored in cache/DB

Vertical Scaling:

  • Add more CPU/RAM to servers

  • Limited by hardware

  • Simpler but has limits

Microservices:

Monolith:
[Single App] → [DB]

Microservices:
[User Service] → [User DB]
[Post Service] → [Post DB]
[Feed Service] → [Feed DB]

Benefits:

  • Independent scaling

  • Technology flexibility

  • Fault isolation

Drawbacks:

  • Increased complexity

  • Network latency

  • Distributed transactions

7. Reliability & Availability

Replication:

  • Master-Slave: One writer, multiple readers

  • Master-Master: Multiple writers (conflict resolution needed)

  • Multi-region: Geographic redundancy

Failover:

  • Active-Passive: Standby server takes over

  • Active-Active: Both servers handle traffic

Rate Limiting:

  • Token bucket algorithm

  • Leaky bucket algorithm

  • Fixed window counter

  • Sliding window log

Circuit Breaker:

States:
Closed → Normal operation
Open → Reject requests immediately
Half-Open → Test if service recovered

8. Common System Design Patterns

Content Delivery:

  • Use CDN for static assets

  • Geo-distributed edge servers

  • Cache at edge locations

Data Consistency:

  • Strong Consistency: Read reflects latest write (ACID)

  • Eventual Consistency: Reads eventually reflect write (BASE)

  • CAP Theorem: Choose 2 of 3: Consistency, Availability, Partition Tolerance

API Design:

RESTful:
GET    /api/users/{id}
POST   /api/users
PUT    /api/users/{id}
DELETE /api/users/{id}

GraphQL:
query {
  user(id: "123") {
    name
    posts {
      title
    }
  }
}

9. System Design Template

Use this structure (based on system_design/00_template.md):

# {System Name}

## 1. Requirements
### Functional
- [List core features]

### Non-Functional
- Scale: [Users, QPS, Data]
- Performance: [Latency requirements]
- Availability: [Uptime target]

## 2. Capacity Estimation
- Traffic: [QPS calculations]
- Storage: [Data size, growth]
- Bandwidth: [Network requirements]

## 3. API Design

[endpoint] - [description]

## 4. High-Level Architecture
[Diagram]

## 5. Database Schema
[Tables and relationships]

## 6. Detailed Design
### Component 1
[Deep dive]

### Component 2
[Deep dive]

## 7. Scalability
[How to scale each component]

## 8. Trade-offs
[Decisions and alternatives]

10. Real-World Examples

Reference case studies in system_design/:

  • Netflix: Video streaming, recommendation

  • Twitter: Timeline, tweet storage, trending

  • Uber: Real-time matching, location tracking

  • Instagram: Image storage, feed generation

  • WhatsApp: Message delivery, presence

Common Patterns:

  • News Feed: Fan-out on write vs fan-out on read

  • Rate Limiter: Token bucket with Redis

  • URL Shortener: Base62 encoding, hash collision

  • Chat System: WebSocket, message queue

  • Notification: Push notification service, APNs/FCM

Interview Tips

Time Management:

  • Requirements: 10%

  • High-level design: 25%

  • Deep dive: 50%

  • Wrap up: 15%

Communication:

  • Think out loud

  • Ask clarifying questions

  • Discuss trade-offs

  • Acknowledge limitations

What interviewers look for:

  • Problem-solving approach

  • Technical depth

  • Trade-off analysis

  • Scale awareness

  • Communication skills

Common Mistakes to Avoid

  • Jumping to solution without requirements

  • Over-engineering simple problems

  • Under-estimating scale requirements

  • Ignoring single points of failure

  • Not considering monitoring/alerting

  • Forgetting about data consistency

  • Missing security considerations

Project Context

  • Templates in system_design/00_template.md

  • Case studies in system_design/*.md

  • Reference materials in doc/system_design/

  • Follow the established documentation pattern

返回排行榜