postgresql-best-practices

安装量: 287
排名: #3149

安装

npx skills add https://github.com/mindrally/skills --skill postgresql-best-practices

PostgreSQL Best Practices Core Principles Leverage PostgreSQL's advanced features for robust data modeling Optimize queries using EXPLAIN ANALYZE and proper indexing strategies Use native PostgreSQL data types appropriately Implement proper connection pooling and resource management Follow PostgreSQL-specific security best practices Schema Design Data Types Use appropriate native types: UUID, JSONB, ARRAY, INET, CIDR Prefer TIMESTAMPTZ over TIMESTAMP for timezone-aware applications Use TEXT instead of VARCHAR when no length limit is needed Consider NUMERIC for precise decimal calculations (financial data) Use SERIAL or BIGSERIAL for auto-incrementing IDs, or UUID for distributed systems CREATE TABLE orders ( order_id UUID PRIMARY KEY DEFAULT gen_random_uuid(), customer_id UUID NOT NULL REFERENCES customers(customer_id), order_data JSONB NOT NULL DEFAULT '{}', tags TEXT[] DEFAULT '{}', total_amount NUMERIC(12, 2) NOT NULL, created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW() );

Table Design Always define primary keys Use foreign keys with appropriate ON DELETE/UPDATE actions Add NOT NULL constraints where appropriate Use CHECK constraints for data validation Consider partitioning for large tables CREATE TABLE products ( product_id SERIAL PRIMARY KEY, sku VARCHAR(50) NOT NULL UNIQUE, name TEXT NOT NULL, price NUMERIC(10, 2) NOT NULL CHECK (price >= 0), status VARCHAR(20) NOT NULL DEFAULT 'active' CHECK (status IN ('active', 'inactive', 'discontinued')), metadata JSONB DEFAULT '{}' );

Partitioning Use declarative partitioning for large tables (millions of rows) Choose appropriate partition strategy: RANGE, LIST, or HASH Create indexes on partitioned tables after partitioning CREATE TABLE events ( event_id BIGSERIAL, event_type VARCHAR(50) NOT NULL, payload JSONB, created_at TIMESTAMPTZ NOT NULL ) PARTITION BY RANGE (created_at);

CREATE TABLE events_2024_q1 PARTITION OF events FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');

Indexing Strategies Index Types Use B-tree indexes (default) for equality and range queries Use GIN indexes for JSONB, arrays, and full-text search Use GiST indexes for geometric data and range types Use BRIN indexes for large, naturally ordered data Consider partial indexes for filtered queries -- B-tree index for common lookups CREATE INDEX idx_orders_customer ON orders(customer_id);

-- GIN index for JSONB queries CREATE INDEX idx_orders_data ON orders USING GIN (order_data);

-- Partial index for active records only CREATE INDEX idx_active_products ON products(name) WHERE status = 'active';

-- Covering index to avoid table lookup CREATE INDEX idx_orders_covering ON orders(customer_id) INCLUDE (order_date, total_amount);

Index Maintenance Regularly run ANALYZE to update statistics Use REINDEX for bloated indexes Monitor index usage with pg_stat_user_indexes Remove unused indexes to reduce write overhead -- Check index usage SELECT schemaname, relname, indexrelname, idx_scan, idx_tup_read FROM pg_stat_user_indexes ORDER BY idx_scan ASC;

Query Optimization EXPLAIN ANALYZE Always analyze query plans for slow queries Look for sequential scans on large tables Identify missing indexes from query plans Watch for high row estimates vs actual rows EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) SELECT c.name, COUNT(o.order_id) FROM customers c LEFT JOIN orders o ON c.customer_id = o.customer_id WHERE c.created_at > '2024-01-01' GROUP BY c.customer_id, c.name;

Common Table Expressions (CTEs) Use CTEs for complex query organization Note: CTEs are optimization fences in older PostgreSQL versions Use MATERIALIZED/NOT MATERIALIZED hints in PostgreSQL 12+ WITH recent_orders AS MATERIALIZED ( SELECT customer_id, COUNT(*) AS order_count, SUM(total) AS total_spent FROM orders WHERE order_date > CURRENT_DATE - INTERVAL '30 days' GROUP BY customer_id ) SELECT c.name, ro.order_count, ro.total_spent FROM customers c JOIN recent_orders ro ON c.customer_id = ro.customer_id WHERE ro.total_spent > 1000;

Window Functions Use window functions for analytics queries Leverage PARTITION BY and ORDER BY for complex calculations SELECT order_id, customer_id, total_amount, SUM(total_amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS running_total, ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS order_rank FROM orders;

JSONB Best Practices Use JSONB over JSON for better performance and indexing Create GIN indexes for JSONB columns you query Use containment operators (@>, <@) for efficient queries Extract frequently queried fields to regular columns -- Efficient JSONB query with GIN index SELECT * FROM products WHERE metadata @> '{"category": "electronics"}';

-- Extract specific fields SELECT product_id, metadata->>'brand' AS brand, (metadata->>'rating')::numeric AS rating FROM products WHERE metadata ? 'rating';

Connection Management Connection Pooling Use PgBouncer or pgpool-II for connection pooling Set appropriate pool sizes based on workload Use transaction pooling mode for short-lived connections Connection Settings -- Recommended session settings SET statement_timeout = '30s'; SET lock_timeout = '10s'; SET idle_in_transaction_session_timeout = '60s';

Transactions and Locking Use appropriate transaction isolation levels Keep transactions short to reduce lock contention Use advisory locks for application-level locking Monitor and resolve lock conflicts -- Use advisory locks for application coordination SELECT pg_advisory_lock(hashtext('resource_name')); -- Do work SELECT pg_advisory_unlock(hashtext('resource_name'));

-- Check for blocking queries SELECT blocked_locks.pid AS blocked_pid, blocking_locks.pid AS blocking_pid, blocked_activity.query AS blocked_query FROM pg_catalog.pg_locks blocked_locks JOIN pg_catalog.pg_locks blocking_locks ON blocking_locks.locktype = blocked_locks.locktype AND blocking_locks.relation = blocked_locks.relation AND blocking_locks.pid != blocked_locks.pid JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid;

Maintenance Vacuum and Analyze Enable autovacuum and tune for your workload Run manual VACUUM ANALYZE after bulk operations Monitor table bloat -- Check table bloat SELECT schemaname, relname, n_live_tup, n_dead_tup, round(n_dead_tup * 100.0 / nullif(n_live_tup + n_dead_tup, 0), 2) AS dead_pct FROM pg_stat_user_tables WHERE n_dead_tup > 1000 ORDER BY n_dead_tup DESC;

Backup Strategies Use pg_dump for logical backups Use pg_basebackup for physical backups Implement point-in-time recovery (PITR) with WAL archiving Test backup restoration regularly Security Use SSL/TLS for connections Implement row-level security (RLS) for multi-tenant applications Use roles and GRANT/REVOKE for access control Audit sensitive operations with pgAudit extension -- Enable row-level security ALTER TABLE documents ENABLE ROW LEVEL SECURITY;

CREATE POLICY documents_tenant_policy ON documents FOR ALL USING (tenant_id = current_setting('app.tenant_id')::uuid);

-- Grant minimal privileges GRANT SELECT, INSERT, UPDATE ON orders TO app_user; GRANT USAGE ON SEQUENCE orders_order_id_seq TO app_user;

Monitoring Monitor with pg_stat_statements extension Track slow queries and optimize regularly Set up alerts for replication lag, connection count, and disk usage Use pg_stat_activity to monitor active queries -- Enable pg_stat_statements CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Find slow queries SELECT query, calls, mean_exec_time, total_exec_time FROM pg_stat_statements ORDER BY mean_exec_time DESC LIMIT 10;

返回排行榜