database-schema-design

安装量: 11.6K
排名: #181

安装

npx skills add https://github.com/supercent-io/skills-template --skill database-schema-design
Database Schema Design
When to use this skill
Lists specific situations where this skill should be triggered:
New Project
Database schema design for a new application
Schema Refactoring
Redesigning an existing schema for performance or scalability
Relationship Definition
Implementing 1:1, 1:N, N:M relationships between tables
Migration
Safely applying schema changes
Performance Issues
Index and schema optimization to resolve slow queries
Input Format
The required and optional input information to collect from the user:
Required Information
Database Type
PostgreSQL, MySQL, MongoDB, SQLite, etc.
Domain Description
What data will be stored (e.g., e-commerce, blog, social media)
Key Entities
Core data objects (e.g., User, Product, Order)
Optional Information
Expected Data Volume
Small (<10K rows), Medium (10K-1M), Large (>1M) (default: Medium)
Read/Write Ratio
Read-heavy, Write-heavy, Balanced (default: Balanced)
Transaction Requirements
Whether ACID is required (default: true)
Sharding/Partitioning
Whether large data distribution is needed (default: false) Input Example Design a database for an e-commerce platform: - DB: PostgreSQL - Entities: User, Product, Order, Review - Relationships: - A User can have multiple Orders - An Order contains multiple Products (N:M) - A Review is linked to a User and a Product - Expected data: 100,000 users, 10,000 products - Read-heavy (frequent product lookups) Instructions Specifies the step-by-step task sequence to follow precisely. Step 1: Define Entities and Attributes Identify core data objects and their attributes. Tasks : Extract nouns from business requirements → entities List each entity's attributes (columns) Determine data types (VARCHAR, INTEGER, TIMESTAMP, JSON, etc.) Designate Primary Keys (UUID vs Auto-increment ID) Example (E-commerce): Users - id: UUID PRIMARY KEY - email: VARCHAR(255) UNIQUE NOT NULL - username: VARCHAR(50) UNIQUE NOT NULL - password_hash: VARCHAR(255) NOT NULL - created_at: TIMESTAMP DEFAULT NOW() - updated_at: TIMESTAMP DEFAULT NOW() Products - id: UUID PRIMARY KEY - name: VARCHAR(255) NOT NULL - description: TEXT - price: DECIMAL(10, 2) NOT NULL - stock: INTEGER DEFAULT 0 - category_id: UUID REFERENCES Categories(id) - created_at: TIMESTAMP DEFAULT NOW() Orders - id: UUID PRIMARY KEY - user_id: UUID REFERENCES Users(id) - total_amount: DECIMAL(10, 2) NOT NULL - status: VARCHAR(20) DEFAULT 'pending' - created_at: TIMESTAMP DEFAULT NOW() OrderItems (Junction table) - id: UUID PRIMARY KEY - order_id: UUID REFERENCES Orders(id) ON DELETE CASCADE - product_id: UUID REFERENCES Products(id) - quantity: INTEGER NOT NULL - price: DECIMAL(10, 2) NOT NULL Step 2: Design Relationships and Normalization Define relationships between tables and apply normalization. Tasks : 1:1 relationship: Foreign Key + UNIQUE constraint 1:N relationship: Foreign Key N:M relationship: Create junction table Determine normalization level (1NF ~ 3NF) Decision Criteria : OLTP systems → normalize to 3NF (data integrity) OLAP/analytics systems → denormalization allowed (query performance) Read-heavy → minimize JOINs with partial denormalization Write-heavy → full normalization to eliminate redundancy Example (ERD Mermaid): erDiagram Users ||--o{ Orders : places Orders ||--|{ OrderItems : contains Products ||--o{ OrderItems : "ordered in" Categories ||--o{ Products : categorizes Users ||--o{ Reviews : writes Products ||--o{ Reviews : "reviewed by" Users { uuid id PK string email UK string username UK string password_hash timestamp created_at } Products { uuid id PK string name decimal price int stock uuid category_id FK } Orders { uuid id PK uuid user_id FK decimal total_amount string status timestamp created_at } OrderItems { uuid id PK uuid order_id FK uuid product_id FK int quantity decimal price } Step 3: Establish Indexing Strategy Design indexes for query performance. Tasks : Primary Keys automatically create indexes Columns frequently used in WHERE clauses → add indexes Foreign Keys used in JOINs → indexes Consider composite indexes (WHERE col1 = ? AND col2 = ?) UNIQUE indexes (email, username, etc.) Checklist : Indexes on frequently queried columns Indexes on Foreign Key columns Composite index order optimized (high selectivity columns first) Avoid excessive indexes (degrades INSERT/UPDATE performance) Example (PostgreSQL): -- Primary Keys (auto-indexed) CREATE TABLE users ( id UUID PRIMARY KEY DEFAULT gen_random_uuid ( ) , email VARCHAR ( 255 ) UNIQUE NOT NULL , -- UNIQUE = auto-indexed username VARCHAR ( 50 ) UNIQUE NOT NULL , password_hash VARCHAR ( 255 ) NOT NULL , created_at TIMESTAMP DEFAULT NOW ( ) , updated_at TIMESTAMP DEFAULT NOW ( ) ) ; -- Foreign Keys + explicit indexes CREATE TABLE orders ( id UUID PRIMARY KEY DEFAULT gen_random_uuid ( ) , user_id UUID NOT NULL REFERENCES users ( id ) ON DELETE CASCADE , total_amount DECIMAL ( 10 , 2 ) NOT NULL , status VARCHAR ( 20 ) DEFAULT 'pending' , created_at TIMESTAMP DEFAULT NOW ( ) ) ; CREATE INDEX idx_orders_user_id ON orders ( user_id ) ; CREATE INDEX idx_orders_status ON orders ( status ) ; CREATE INDEX idx_orders_created_at ON orders ( created_at ) ; -- Composite index (status and created_at frequently queried together) CREATE INDEX idx_orders_status_created ON orders ( status , created_at DESC ) ; -- Products table CREATE TABLE products ( id UUID PRIMARY KEY DEFAULT gen_random_uuid ( ) , name VARCHAR ( 255 ) NOT NULL , description TEXT , price DECIMAL ( 10 , 2 ) NOT NULL CHECK ( price

= 0 ) , stock INTEGER DEFAULT 0 CHECK ( stock = 0 ) , category_id UUID REFERENCES categories ( id ) , created_at TIMESTAMP DEFAULT NOW ( ) ) ; CREATE INDEX idx_products_category ON products ( category_id ) ; CREATE INDEX idx_products_price ON products ( price ) ; -- price range search CREATE INDEX idx_products_name ON products ( name ) ; -- product name search -- Full-text search (PostgreSQL) CREATE INDEX idx_products_name_fts ON products USING GIN ( to_tsvector ( 'english' , name ) ) ; CREATE INDEX idx_products_description_fts ON products USING GIN ( to_tsvector ( 'english' , description ) ) ; Step 4: Set Up Constraints and Triggers Add constraints to ensure data integrity. Tasks : NOT NULL: required columns UNIQUE: columns that must be unique CHECK: value range constraints (e.g., price >= 0) Foreign Key + CASCADE option Set default values Example : CREATE TABLE products ( id UUID PRIMARY KEY DEFAULT gen_random_uuid ( ) , name VARCHAR ( 255 ) NOT NULL , price DECIMAL ( 10 , 2 ) NOT NULL CHECK ( price = 0 ) , stock INTEGER DEFAULT 0 CHECK ( stock = 0 ) , discount_percent INTEGER CHECK ( discount_percent = 0 AND discount_percent <= 100 ) , category_id UUID REFERENCES categories ( id ) ON DELETE SET NULL , created_at TIMESTAMP DEFAULT NOW ( ) , updated_at TIMESTAMP DEFAULT NOW ( ) ) ; -- Trigger: auto-update updated_at CREATE OR REPLACE FUNCTION update_updated_at_column ( ) RETURNS TRIGGER AS $$ BEGIN NEW . updated_at = NOW ( ) ; RETURN NEW ; END ; $$ LANGUAGE plpgsql ; CREATE TRIGGER update_products_updated_at BEFORE UPDATE ON products FOR EACH ROW EXECUTE FUNCTION update_updated_at_column ( ) ; Step 5: Write Migration Scripts Write migrations that safely apply schema changes. Tasks : UP migration: apply changes DOWN migration: rollback Wrap in transactions Prevent data loss (use ALTER TABLE carefully) Example (SQL migration): -- migrations/001_create_initial_schema.up.sql BEGIN ; CREATE EXTENSION IF NOT EXISTS "uuid-ossp" ; CREATE TABLE users ( id UUID PRIMARY KEY DEFAULT gen_random_uuid ( ) , email VARCHAR ( 255 ) UNIQUE NOT NULL , username VARCHAR ( 50 ) UNIQUE NOT NULL , password_hash VARCHAR ( 255 ) NOT NULL , created_at TIMESTAMP DEFAULT NOW ( ) , updated_at TIMESTAMP DEFAULT NOW ( ) ) ; CREATE TABLE categories ( id UUID PRIMARY KEY DEFAULT gen_random_uuid ( ) , name VARCHAR ( 100 ) UNIQUE NOT NULL , parent_id UUID REFERENCES categories ( id ) ) ; CREATE TABLE products ( id UUID PRIMARY KEY DEFAULT gen_random_uuid ( ) , name VARCHAR ( 255 ) NOT NULL , description TEXT , price DECIMAL ( 10 , 2 ) NOT NULL CHECK ( price = 0 ) , stock INTEGER DEFAULT 0 CHECK ( stock = 0 ) , category_id UUID REFERENCES categories ( id ) , created_at TIMESTAMP DEFAULT NOW ( ) , updated_at TIMESTAMP DEFAULT NOW ( ) ) ; CREATE INDEX idx_products_category ON products ( category_id ) ; CREATE INDEX idx_products_price ON products ( price ) ; COMMIT ; -- migrations/001_create_initial_schema.down.sql BEGIN ; DROP TABLE IF EXISTS products CASCADE ; DROP TABLE IF EXISTS categories CASCADE ; DROP TABLE IF EXISTS users CASCADE ; COMMIT ; Output format Defines the exact format that deliverables should follow. Basic Structure project/ ├── database/ │ ├── schema.sql # full schema │ ├── migrations/ │ │ ├── 001_create_users.up.sql │ │ ├── 001_create_users.down.sql │ │ ├── 002_create_products.up.sql │ │ └── 002_create_products.down.sql │ ├── seeds/ │ │ └── sample_data.sql # test data │ └── docs/ │ ├── ERD.md # Mermaid ERD diagram │ └── SCHEMA.md # schema documentation └── README.md ERD Diagram (Mermaid Format)

Database Schema

Entity Relationship Diagram ```mermaid erDiagram Users ||--o{ Orders : places Orders ||--|{ OrderItems : contains Products ||--o{ OrderItems : "ordered in" Users { uuid id PK string email UK string username UK } Products { uuid id PK string name decimal price } ```

Table Descriptions

users

**
Purpose
**

Store user account information

**
Indexes
**

email, username

**
Estimated rows
**
100,000

products

**
Purpose
**

Product catalog

**
Indexes
**

category_id, price, name

**
Estimated rows
**
10,000
Constraints
Specifies mandatory rules and prohibited actions.
Mandatory Rules (MUST)
Primary Key Required
Define a Primary Key on every table
Unique record identification
Ensures referential integrity
Explicit Foreign Keys
Tables with relationships must define Foreign Keys
Specify ON DELETE CASCADE/SET NULL options
Prevent orphan records
Use NOT NULL Appropriately
Required columns must be NOT NULL
Clearly specify nullable vs. non-nullable
Providing defaults is recommended
Prohibited Actions (MUST NOT)
Avoid EAV Pattern Abuse
Use the Entity-Attribute-Value pattern only in special cases
Query complexity increases dramatically
Performance degradation
Excessive Denormalization
Be careful when denormalizing for performance
Data consistency issues
Risk of update anomalies
No Plaintext Storage of Sensitive Data
Never store passwords, card numbers, etc. in plaintext
Hashing/encryption is mandatory
Legal liability issues
Security Rules
Principle of Least Privilege
Grant only the necessary permissions to application DB accounts
SQL Injection Prevention
Use Prepared Statements / Parameterized Queries
Encrypt Sensitive Columns
Consider encrypting personally identifiable information at rest
Examples
Demonstrates how to apply the skill through real-world use cases.
Example 1: Blog Platform Schema
Situation
Database design for a Medium-style blog platform
User Request
:
Design a PostgreSQL schema for a blog platform:
- Users can write multiple posts
- Posts can have multiple tags (N:M)
- Users can like and bookmark posts
- Comment feature (with nested replies)
Final Result
:
-- Users
CREATE
TABLE
users
(
id UUID
PRIMARY
KEY
DEFAULT
gen_random_uuid
(
)
,
email
VARCHAR
(
255
)
UNIQUE
NOT
NULL
,
username
VARCHAR
(
50
)
UNIQUE
NOT
NULL
,
bio
TEXT
,
avatar_url
VARCHAR
(
500
)
,
created_at
TIMESTAMP
DEFAULT
NOW
(
)
)
;
-- Posts
CREATE
TABLE
posts
(
id UUID
PRIMARY
KEY
DEFAULT
gen_random_uuid
(
)
,
author_id UUID
NOT
NULL
REFERENCES
users
(
id
)
ON
DELETE
CASCADE
,
title
VARCHAR
(
255
)
NOT
NULL
,
slug
VARCHAR
(
255
)
UNIQUE
NOT
NULL
,
content
TEXT
NOT
NULL
,
published_at
TIMESTAMP
,
created_at
TIMESTAMP
DEFAULT
NOW
(
)
,
updated_at
TIMESTAMP
DEFAULT
NOW
(
)
)
;
CREATE
INDEX
idx_posts_author
ON
posts
(
author_id
)
;
CREATE
INDEX
idx_posts_published
ON
posts
(
published_at
)
;
CREATE
INDEX
idx_posts_slug
ON
posts
(
slug
)
;
-- Tags
CREATE
TABLE
tags
(
id UUID
PRIMARY
KEY
DEFAULT
gen_random_uuid
(
)
,
name
VARCHAR
(
50
)
UNIQUE
NOT
NULL
,
slug
VARCHAR
(
50
)
UNIQUE
NOT
NULL
)
;
-- Post-Tag relationship (N:M)
CREATE
TABLE
post_tags
(
post_id UUID
REFERENCES
posts
(
id
)
ON
DELETE
CASCADE
,
tag_id UUID
REFERENCES
tags
(
id
)
ON
DELETE
CASCADE
,
PRIMARY
KEY
(
post_id
,
tag_id
)
)
;
CREATE
INDEX
idx_post_tags_post
ON
post_tags
(
post_id
)
;
CREATE
INDEX
idx_post_tags_tag
ON
post_tags
(
tag_id
)
;
-- Likes
CREATE
TABLE
post_likes
(
user_id UUID
REFERENCES
users
(
id
)
ON
DELETE
CASCADE
,
post_id UUID
REFERENCES
posts
(
id
)
ON
DELETE
CASCADE
,
created_at
TIMESTAMP
DEFAULT
NOW
(
)
,
PRIMARY
KEY
(
user_id
,
post_id
)
)
;
-- Bookmarks
CREATE
TABLE
post_bookmarks
(
user_id UUID
REFERENCES
users
(
id
)
ON
DELETE
CASCADE
,
post_id UUID
REFERENCES
posts
(
id
)
ON
DELETE
CASCADE
,
created_at
TIMESTAMP
DEFAULT
NOW
(
)
,
PRIMARY
KEY
(
user_id
,
post_id
)
)
;
-- Comments (self-referencing for nested comments)
CREATE
TABLE
comments
(
id UUID
PRIMARY
KEY
DEFAULT
gen_random_uuid
(
)
,
post_id UUID
NOT
NULL
REFERENCES
posts
(
id
)
ON
DELETE
CASCADE
,
author_id UUID
NOT
NULL
REFERENCES
users
(
id
)
ON
DELETE
CASCADE
,
parent_comment_id UUID
REFERENCES
comments
(
id
)
ON
DELETE
CASCADE
,
content
TEXT
NOT
NULL
,
created_at
TIMESTAMP
DEFAULT
NOW
(
)
,
updated_at
TIMESTAMP
DEFAULT
NOW
(
)
)
;
CREATE
INDEX
idx_comments_post
ON
comments
(
post_id
)
;
CREATE
INDEX
idx_comments_author
ON
comments
(
author_id
)
;
CREATE
INDEX
idx_comments_parent
ON
comments
(
parent_comment_id
)
;
Example 2: MongoDB Schema (NoSQL)
Situation
MongoDB schema for a real-time chat app
User Request
:
Design a MongoDB schema for a real-time chat app.
Reads are very frequent, and message history needs to be retrieved quickly.
Final Result
:
// users collection
{
_id
:
ObjectId
,
username
:
String
,
// indexed, unique
email
:
String
,
// indexed, unique
avatar_url
:
String
,
status
:
String
,
// 'online', 'offline', 'away'
last_seen
:
Date
,
created_at
:
Date
}
// conversations collection (denormalized - read-optimized)
{
_id
:
ObjectId
,
participants
:
[
// indexed
{
user_id
:
ObjectId
,
username
:
String
,
avatar_url
:
String
}
]
,
last_message
:
{
// denormalized for fast recent-message retrieval
content
:
String
,
sender_id
:
ObjectId
,
sent_at
:
Date
}
,
unread_counts
:
{
// unread message count per participant
"user_id_1"
:
5
,
"user_id_2"
:
0
}
,
created_at
:
Date
,
updated_at
:
Date
}
// messages collection
{
_id
:
ObjectId
,
conversation_id
:
ObjectId
,
// indexed
sender_id
:
ObjectId
,
content
:
String
,
attachments
:
[
{
type
:
String
,
// 'image', 'file', 'video'
url
:
String
,
filename
:
String
}
]
,
read_by
:
[
ObjectId
]
,
// array of user IDs who have read the message
sent_at
:
Date
,
// indexed
edited_at
:
Date
}
// Indexes
db
.
users
.
createIndex
(
{
username
:
1
}
,
{
unique
:
true
}
)
;
db
.
users
.
createIndex
(
{
email
:
1
}
,
{
unique
:
true
}
)
;
db
.
conversations
.
createIndex
(
{
"participants.user_id"
:
1
}
)
;
db
.
conversations
.
createIndex
(
{
updated_at
:
-
1
}
)
;
db
.
messages
.
createIndex
(
{
conversation_id
:
1
,
sent_at
:
-
1
}
)
;
db
.
messages
.
createIndex
(
{
sender_id
:
1
}
)
;
Design Highlights
:
Denormalization for read optimization (embedding last_message)
Indexes on frequently accessed fields
Using array fields (participants, read_by)
Best practices
Quality Improvement
Naming Convention Consistency
Use snake_case for table/column names
users, post_tags, created_at
Be consistent with plurals/singulars (tables plural, columns singular, etc.)
Consider Soft Delete
Use logical deletion instead of physical deletion for important data
deleted_at TIMESTAMP (NULL = active, NOT NULL = deleted)
Allows recovery of accidentally deleted data
Audit trail
Timestamps Required
Include created_at and updated_at in most tables
Data tracking and debugging
Time-series analysis
Efficiency Improvements
Partial Indexes
Minimize index size with conditional indexes
CREATE
INDEX
idx_posts_published
ON
posts
(
published_at
)
WHERE
published_at
IS
NOT
NULL
;
Materialized Views
Cache complex aggregate queries as Materialized Views
Partitioning
Partition large tables by date/range
Common Issues
Issue 1: N+1 Query Problem
Symptom
Multiple DB calls when a single query would suffice
Cause
Individual lookups in a loop without JOINs
Solution
:
-- ❌ Bad example: N+1 queries
SELECT
*
FROM
posts
;
-- 1 time
-- for each post
SELECT
*
FROM
users
WHERE
id
=
?
;
-- N times
-- ✅ Good example: 1 query
SELECT
posts
.
*
,
users
.
username
,
users
.
avatar_url
FROM
posts
JOIN
users
ON
posts
.
author_id
=
users
.
id
;
Issue 2: Slow JOINs Due to Unindexed Foreign Keys
Symptom
JOIN queries are very slow
Cause
Missing index on Foreign Key column
Solution
:
CREATE
INDEX
idx_orders_user_id
ON
orders
(
user_id
)
;
CREATE
INDEX
idx_order_items_order_id
ON
order_items
(
order_id
)
;
CREATE
INDEX
idx_order_items_product_id
ON
order_items
(
product_id
)
;
Issue 3: UUID vs Auto-increment Performance
Symptom
Insert performance degradation when using UUID Primary Keys
Cause
UUIDs are random, causing index fragmentation
Solution
:
PostgreSQL: Use
uuid_generate_v7()
(time-ordered UUID)
MySQL: Use
UUID_TO_BIN(UUID(), 1)
Or consider using Auto-increment BIGINT
References
Official Documentation
PostgreSQL Documentation
MySQL Documentation
MongoDB Schema Design Best Practices
Tools
dbdiagram.io
- ERD diagram creation
PgModeler
- PostgreSQL modeling tool
Prisma
- ORM + migrations
Learning Resources
Database Design Course (freecodecamp)
Use The Index, Luke
- SQL indexing guide
Metadata
Version
Current Version
1.0.0
Last Updated
2025-01-01
Compatible Platforms
Claude, ChatGPT, Gemini
返回排行榜