postgresql

安装量: 36
排名: #19405

安装

npx skills add https://github.com/sickn33/antigravity-awesome-skills --skill postgresql
PostgreSQL Table Design
Use this skill when
Designing a schema for PostgreSQL
Selecting data types and constraints
Planning indexes, partitions, or RLS policies
Reviewing tables for scale and maintainability
Do not use this skill when
You are targeting a non-PostgreSQL database
You only need query tuning without schema changes
You require a DB-agnostic modeling guide
Instructions
Capture entities, access patterns, and scale targets (rows, QPS, retention).
Choose data types and constraints that enforce invariants.
Add indexes for real query paths and validate with
EXPLAIN
.
Plan partitioning or RLS where required by scale or access control.
Review migration impact and apply changes safely.
Safety
Avoid destructive DDL on production without backups and a rollback plan.
Use migrations and staging validation before applying schema changes.
Core Rules
Define a
PRIMARY KEY
for reference tables (users, orders, etc.). Not always needed for time-series/event/log data. When used, prefer
BIGINT GENERATED ALWAYS AS IDENTITY
; use
UUID
only when global uniqueness/opacity is needed.
Normalize first (to 3NF)
to eliminate data redundancy and update anomalies; denormalize
only
for measured, high-ROI reads where join performance is proven problematic. Premature denormalization creates maintenance burden.
Add
NOT NULL
everywhere it’s semantically required; use
DEFAULT
s for common values.
Create
indexes for access paths you actually query
PK/unique (auto),
FK columns (manual!)
, frequent filters/sorts, and join keys.
Prefer
TIMESTAMPTZ
for event time;
NUMERIC
for money;
TEXT
for strings;
BIGINT
for integer values,
DOUBLE PRECISION
for floats (or
NUMERIC
for exact decimal arithmetic).
PostgreSQL “Gotchas”
Identifiers
unquoted → lowercased. Avoid quoted/mixed-case names. Convention: use
snake_case
for table/column names.
Unique + NULLs
UNIQUE allows multiple NULLs. Use
UNIQUE (...) NULLS NOT DISTINCT
(PG15+) to restrict to one NULL.
FK indexes
PostgreSQL
does not
auto-index FK columns. Add them.
No silent coercions
length/precision overflows error out (no truncation). Example: inserting 999 into
NUMERIC(2,0)
fails with error, unlike some databases that silently truncate or round.
Sequences/identity have gaps
(normal; don't "fix"). Rollbacks, crashes, and concurrent transactions create gaps in ID sequences (1, 2, 5, 6...). This is expected behavior—don't try to make IDs consecutive.
Heap storage
no clustered PK by default (unlike SQL Server/MySQL InnoDB);
CLUSTER
is one-off reorganization, not maintained on subsequent inserts. Row order on disk is insertion order unless explicitly clustered.
MVCC
updates/deletes leave dead tuples; vacuum handles them—design to avoid hot wide-row churn.
Data Types
IDs
:
BIGINT GENERATED ALWAYS AS IDENTITY
preferred (
GENERATED BY DEFAULT
also fine);
UUID
when merging/federating/used in a distributed system or for opaque IDs. Generate with
uuidv7()
(preferred if using PG18+) or
gen_random_uuid()
(if using an older PG version).
Integers
prefer
BIGINT
unless storage space is critical;
INTEGER
for smaller ranges; avoid
SMALLINT
unless constrained.
Floats
prefer
DOUBLE PRECISION
over
REAL
unless storage space is critical. Use
NUMERIC
for exact decimal arithmetic.
Strings
prefer
TEXT
; if length limits needed, use
CHECK (LENGTH(col) <= n)
instead of
VARCHAR(n)
; avoid
CHAR(n)
. Use
BYTEA
for binary data. Large strings/binary (>2KB default threshold) automatically stored in TOAST with compression. TOAST storage:
PLAIN
(no TOAST),
EXTENDED
(compress + out-of-line),
EXTERNAL
(out-of-line, no compress),
MAIN
(compress, keep in-line if possible). Default
EXTENDED
usually optimal. Control with
ALTER TABLE tbl ALTER COLUMN col SET STORAGE strategy
and
ALTER TABLE tbl SET (toast_tuple_target = 4096)
for threshold. Case-insensitive: for locale/accent handling use non-deterministic collations; for plain ASCII use expression indexes on
LOWER(col)
(preferred unless column needs case-insensitive PK/FK/UNIQUE) or
CITEXT
.
Money
:
NUMERIC(p,s)
(never float).
Time
:
TIMESTAMPTZ
for timestamps;
DATE
for date-only;
INTERVAL
for durations. Avoid
TIMESTAMP
(without timezone). Use
now()
for transaction start time,
clock_timestamp()
for current wall-clock time.
Booleans
:
BOOLEAN
with
NOT NULL
constraint unless tri-state values are required.
Enums
:
CREATE TYPE ... AS ENUM
for small, stable sets (e.g. US states, days of week). For business-logic-driven and evolving values (e.g. order statuses) → use TEXT (or INT) + CHECK or lookup table.
Arrays
:
TEXT[]
,
INTEGER[]
, etc. Use for ordered lists where you query elements. Index with
GIN
for containment (
@>
,
<@
) and overlap (
&&
) queries. Access:
arr[1]
(1-indexed),
arr[1:3]
(slicing). Good for tags, categories; avoid for relations—use junction tables instead. Literal syntax:
'{val1,val2}'
or
ARRAY[val1,val2]
.
Range types
:
daterange
,
numrange
,
tstzrange
for intervals. Support overlap (
&&
), containment (
@>
), operators. Index with
GiST
. Good for scheduling, versioning, numeric ranges. Pick a bounds scheme and use it consistently; prefer
[)
(inclusive/exclusive) by default.
Network types
:
INET
for IP addresses,
CIDR
for network ranges,
MACADDR
for MAC addresses. Support network operators (
<<
,
>>
,
&&
).
Geometric types
:
POINT
,
LINE
,
POLYGON
,
CIRCLE
for 2D spatial data. Index with
GiST
. Consider
PostGIS
for advanced spatial features.
Text search
:
TSVECTOR
for full-text search documents,
TSQUERY
for search queries. Index
tsvector
with
GIN
. Always specify language:
to_tsvector('english', col)
and
to_tsquery('english', 'query')
. Never use single-argument versions. This applies to both index expressions and queries.
Domain types
:
CREATE DOMAIN email AS TEXT CHECK (VALUE ~ '^[^@]+@[^@]+$')
for reusable custom types with validation. Enforces constraints across tables.
Composite types
:
CREATE TYPE address AS (street TEXT, city TEXT, zip TEXT)
for structured data within columns. Access with
(col).field
syntax.
JSONB
preferred over JSON; index with
GIN
. Use only for optional/semi-structured attrs. ONLY use JSON if the original ordering of the contents MUST be preserved.
Vector types
:
vector
type by
pgvector
for vector similarity search for embeddings.
Do not use the following data types
DO NOT use
timestamp
(without time zone); DO use
timestamptz
instead.
DO NOT use
char(n)
or
varchar(n)
; DO use
text
instead.
DO NOT use
money
type; DO use
numeric
instead.
DO NOT use
timetz
type; DO use
timestamptz
instead.
DO NOT use
timestamptz(0)
or any other precision specification; DO use
timestamptz
instead
DO NOT use
serial
type; DO use
generated always as identity
instead.
Table Types
Regular
default; fully durable, logged.
TEMPORARY
session-scoped, auto-dropped, not logged. Faster for scratch work.
UNLOGGED
persistent but not crash-safe. Faster writes; good for caches/staging.
Row-Level Security
Enable with
ALTER TABLE tbl ENABLE ROW LEVEL SECURITY
. Create policies:
CREATE POLICY user_access ON orders FOR SELECT TO app_users USING (user_id = current_user_id())
. Built-in user-based access control at the row level.
Constraints
PK
implicit UNIQUE + NOT NULL; creates a B-tree index.
FK
specify
ON DELETE/UPDATE
action (
CASCADE
,
RESTRICT
,
SET NULL
,
SET DEFAULT
). Add explicit index on referencing column—speeds up joins and prevents locking issues on parent deletes/updates. Use
DEFERRABLE INITIALLY DEFERRED
for circular FK dependencies checked at transaction end.
UNIQUE
creates a B-tree index; allows multiple NULLs unless
NULLS NOT DISTINCT
(PG15+). Standard behavior:
(1, NULL)
and
(1, NULL)
are allowed. With
NULLS NOT DISTINCT
only one
(1, NULL)
allowed. Prefer
NULLS NOT DISTINCT
unless you specifically need duplicate NULLs.
CHECK
row-local constraints; NULL values pass the check (three-valued logic). Example:
CHECK (price > 0)
allows NULL prices. Combine with
NOT NULL
to enforce:
price NUMERIC NOT NULL CHECK (price > 0)
.
EXCLUDE
prevents overlapping values using operators.
EXCLUDE USING gist (room_id WITH =, booking_period WITH &&)
prevents double-booking rooms. Requires appropriate index type (often GiST).
Indexing
B-tree

default for equality/range queries (

,
<
,
>
,
BETWEEN
,
ORDER BY
)
Composite
order matters—index used if equality on leftmost prefix (
WHERE a = ? AND b > ?
uses index on
(a,b)
, but
WHERE b = ?
does not). Put most selective/frequently filtered columns first.
Covering
:
CREATE INDEX ON tbl (id) INCLUDE (name, email)
- includes non-key columns for index-only scans without visiting table.
Partial
for hot subsets (
WHERE status = 'active'
CREATE INDEX ON tbl (user_id) WHERE status = 'active'
). Any query with
status = 'active'
can use this index.
Expression
for computed search keys (
CREATE INDEX ON tbl (LOWER(email))
). Expression must match exactly in WHERE clause:
WHERE LOWER(email) = 'user@example.com'
.
GIN
JSONB containment/existence, arrays (
@>
,
?
), full-text search (
@@
)
GiST
ranges, geometry, exclusion constraints
BRIN
very large, naturally ordered data (time-series)—minimal storage overhead. Effective when row order on disk correlates with indexed column (insertion order or after
CLUSTER
).
Partitioning
Use for very large tables (>100M rows) where queries consistently filter on partition key (often time/date).
Alternate use: use for tables where data maintenance tasks dictates e.g. data pruned or bulk replaced periodically
RANGE
common for time-series (
PARTITION BY RANGE (created_at)
). Create partitions:
CREATE TABLE logs_2024_01 PARTITION OF logs FOR VALUES FROM ('2024-01-01') TO ('2024-02-01')
.
TimescaleDB
automates time-based or ID-based partitioning with retention policies and compression.
LIST
for discrete values (
PARTITION BY LIST (region)
). Example:
FOR VALUES IN ('us-east', 'us-west')
.
HASH
for even distribution when no natural key (
PARTITION BY HASH (user_id)
). Creates N partitions with modulus.
Constraint exclusion
requires
CHECK
constraints on partitions for query planner to prune. Auto-created for declarative partitioning (PG10+).
Prefer declarative partitioning or hypertables. Do NOT use table inheritance.
Limitations
no global UNIQUE constraints—include partition key in PK/UNIQUE. FKs from partitioned tables not supported; use triggers.
Special Considerations
Update-Heavy Tables
Separate hot/cold columns
—put frequently updated columns in separate table to minimize bloat.
Use
fillfactor=90
to leave space for HOT updates that avoid index maintenance.
Avoid updating indexed columns
—prevents beneficial HOT updates.
Partition by update patterns
—separate frequently updated rows in a different partition from stable data.
Insert-Heavy Workloads
Minimize indexes
—only create what you query; every index slows inserts.
Use
COPY
or multi-row
INSERT
instead of single-row inserts.
UNLOGGED tables
for rebuildable staging data—much faster writes.
Defer index creation
for bulk loads—>drop index, load data, recreate indexes.
Partition by time/hash
to distribute load.
TimescaleDB
automates partitioning and compression of insert-heavy data.
Use a natural key for primary key
such as a (timestamp, device_id) if enforcing global uniqueness is important many insert-heavy tables don't need a primary key at all.
If you do need a surrogate key,
Prefer
BIGINT GENERATED ALWAYS AS IDENTITY
over
UUID
.
Upsert-Friendly Design
Requires UNIQUE index
on conflict target columns—
ON CONFLICT (col1, col2)
needs exact matching unique index (partial indexes don't work).
Use
EXCLUDED.column
to reference would-be-inserted values; only update columns that actually changed to reduce write overhead.
DO NOTHING
faster
than
DO UPDATE
when no actual update needed.
Safe Schema Evolution
Transactional DDL
most DDL operations can run in transactions and be rolled back—
BEGIN; ALTER TABLE...; ROLLBACK;
for safe testing.
Concurrent index creation
:
CREATE INDEX CONCURRENTLY
avoids blocking writes but can't run in transactions.
Volatile defaults cause rewrites
adding
NOT NULL
columns with volatile defaults (e.g.,
now()
,
gen_random_uuid()
) rewrites entire table. Non-volatile defaults are fast.
Drop constraints before columns
:
ALTER TABLE DROP CONSTRAINT
then
DROP COLUMN
to avoid dependency issues.
Function signature changes
:
CREATE OR REPLACE
with different arguments creates overloads, not replacements. DROP old version if no overload desired.
Generated Columns
... GENERATED ALWAYS AS () STORED
for computed, indexable fields. PG18+ adds
VIRTUAL
columns (computed on read, not stored).
Extensions
pgcrypto
:
crypt()
for password hashing.
uuid-ossp
alternative UUID functions; prefer
pgcrypto
for new projects.
pg_trgm
fuzzy text search with
%
operator,
similarity()
function. Index with GIN for
LIKE '%pattern%'
acceleration.
citext
case-insensitive text type. Prefer expression indexes on
LOWER(col)
unless you need case-insensitive constraints.
btree_gin
/
btree_gist
enable mixed-type indexes (e.g., GIN index on both JSONB and text columns).
hstore
key-value pairs; mostly superseded by JSONB but useful for simple string mappings.
timescaledb
essential for time-series—automated partitioning, retention, compression, continuous aggregates.
postgis
comprehensive geospatial support beyond basic geometric types—essential for location-based applications.
pgvector
vector similarity search for embeddings.
pgaudit
audit logging for all database activity.
JSONB Guidance
Prefer
JSONB
with
GIN
index.
Default:
CREATE INDEX ON tbl USING GIN (jsonb_col);
→ accelerates:
Containment
jsonb_col @> '{"k":"v"}'
Key existence
jsonb_col ? 'k'
,
any/all keys
?|
,
?&
Path containment
on nested docs
Disjunction
jsonb_col @> ANY(ARRAY['{"status":"active"}', '{"status":"pending"}'])
Heavy
@>
workloads: consider opclass
jsonb_path_ops
for smaller/faster containment-only indexes:
CREATE INDEX ON tbl USING GIN (jsonb_col jsonb_path_ops);
Trade-off
loses support for key existence ( ? , ?| , ?& ) queries—only supports containment ( @> ) Equality/range on a specific scalar field: extract and index with B-tree (generated column or expression): ALTER TABLE tbl ADD COLUMN price INT GENERATED ALWAYS AS ((jsonb_col->>'price')::INT) STORED; CREATE INDEX ON tbl (price); Prefer queries like WHERE price BETWEEN 100 AND 500 (uses B-tree) over WHERE (jsonb_col->>'price')::INT BETWEEN 100 AND 500 without index. Arrays inside JSONB: use GIN + @> for containment (e.g., tags). Consider jsonb_path_ops if only doing containment. Keep core relations in tables; use JSONB for optional/variable attributes. Use constraints to limit allowed JSONB values in a column e.g. config JSONB NOT NULL CHECK(jsonb_typeof(config) = 'object') Examples Users CREATE TABLE users ( user_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY , email TEXT NOT NULL UNIQUE , name TEXT NOT NULL , created_at TIMESTAMPTZ NOT NULL DEFAULT now ( ) ) ; CREATE UNIQUE INDEX ON users ( LOWER ( email ) ) ; CREATE INDEX ON users ( created_at ) ; Orders CREATE TABLE orders ( order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY , user_id BIGINT NOT NULL REFERENCES users ( user_id ) , status TEXT NOT NULL DEFAULT 'PENDING' CHECK ( status IN ( 'PENDING' , 'PAID' , 'CANCELED' ) ) , total NUMERIC ( 10 , 2 ) NOT NULL CHECK ( total

0 ) , created_at TIMESTAMPTZ NOT NULL DEFAULT now ( ) ) ; CREATE INDEX ON orders ( user_id ) ; CREATE INDEX ON orders ( created_at ) ; JSONB CREATE TABLE profiles ( user_id BIGINT PRIMARY KEY REFERENCES users ( user_id ) , attrs JSONB NOT NULL DEFAULT '{}' , theme TEXT GENERATED ALWAYS AS ( attrs -

'theme' ) STORED ) ; CREATE INDEX profiles_attrs_gin ON profiles USING GIN ( attrs ) ;

返回排行榜