Database Query Optimization Overview
Slow database queries are a common performance bottleneck. Optimization through indexing, efficient queries, and caching dramatically improves application performance.
When to Use Slow response times High database CPU usage Performance regression New feature deployment Regular maintenance Instructions 1. Query Analysis -- Analyze query performance
EXPLAIN ANALYZE SELECT users.id, users.name, COUNT(orders.id) as order_count FROM users LEFT JOIN orders ON users.id = orders.user_id WHERE users.created_at > '2024-01-01' GROUP BY users.id, users.name ORDER BY order_count DESC;
-- Results show: -- - Seq Scan (slow) vs Index Scan (fast) -- - Rows: actual vs planned (high variance = bad) -- - Execution time (milliseconds)
-- Key metrics: -- - Sequential Scan: Full table read (slow) -- - Index Scan: Uses index (fast) -- - Nested Loop: Joins with loops -- - Sort: In-memory or disk sort
- Indexing Strategy Index Types:
Single Column: CREATE INDEX idx_users_email ON users(email); Use: WHERE email = ? Size: Small, quick to create
Composite Index: CREATE INDEX idx_orders_user_date ON orders(user_id, created_at); Use: WHERE user_id = ? AND created_at > ? Order: Most selective first
Covering Index: CREATE INDEX idx_orders_covering ON orders(user_id) INCLUDE (total_amount); Benefit: No table lookup needed
Partial Index: CREATE INDEX idx_active_users ON users(id) WHERE status = 'active'; Benefit: Smaller, faster
Full Text: CREATE FULLTEXT INDEX idx_search ON articles(title, content); Use: Text search queries
Index Rules:
- Create indexes for WHERE conditions
- Create indexes for JOIN columns
- Create indexes for ORDER BY
- Don't over-index (slows writes)
- Monitor index usage
- Remove unused indexes
- Update statistics regularly
- Partial indexes for filtered queries
Missing Index Query: SELECT object_name, equality_columns FROM sys.dm_db_missing_index_details ORDER BY equality_columns;
- Query Optimization Techniques
Common optimization patterns
BEFORE (N+1 queries)
for user in users: orders = db.query("SELECT * FROM orders WHERE user_id = ?", user.id) # 1 + N queries
AFTER (single query with JOIN)
orders = db.query(""" SELECT u., o. FROM users u LEFT JOIN orders o ON u.id = o.user_id WHERE u.created_at > ? """, date_threshold)
BEFORE (inefficient WHERE)
SELECT * FROM users WHERE LOWER(email) = LOWER('Test@Example.com')
Can't use index (function used)
AFTER (index-friendly)
SELECT * FROM users WHERE email = 'test@example.com'
Case-insensitive constraint + index
BEFORE (wildcard at start)
SELECT * FROM users WHERE email LIKE '%example.com'
Can't use index (wildcard at start)
AFTER (wildcard at end)
SELECT * FROM users WHERE email LIKE 'user%'
Can use index
BEFORE (slow aggregation)
SELECT user_id, COUNT(*) as cnt FROM orders GROUP BY user_id ORDER BY cnt DESC LIMIT 10
AFTER (pre-aggregated)
SELECT user_id, order_count FROM user_order_stats WHERE order_count IS NOT NULL ORDER BY order_count DESC LIMIT 10
- Optimization Checklist Analysis: [ ] Run EXPLAIN ANALYZE on slow queries [ ] Check actual vs estimated rows [ ] Look for sequential scans [ ] Identify expensive operations [ ] Compare execution plans
Indexing: [ ] Index WHERE columns [ ] Index JOIN columns [ ] Index ORDER BY columns [ ] Check unused indexes [ ] Remove duplicate indexes [ ] Create composite indexes strategically [ ] Analyze index statistics
Query Optimization: [ ] Remove unnecessary columns (SELECT *) [ ] Use JOINs instead of subqueries [ ] Avoid functions in WHERE [ ] Use wildcards carefully (avoid %) [ ] Batch operations [ ] Use LIMIT for result sets [ ] Archive old data
Caching: [ ] Implement query caching [ ] Cache aggregations [ ] Use Redis for hot data [ ] Invalidate strategically
Monitoring: [ ] Track slow queries [ ] Monitor index usage [ ] Set up alerts [ ] Regular statistics update [ ] Measure improvements
Expected Improvements:
With Proper Indexing: - Sequential Scan → Index Scan - Response time: 5 seconds → 50ms (100x faster) - CPU usage: 80% → 20% - Concurrent users: 100 → 1000
Quick Wins: - Add index to frequently filtered column - Fix N+1 queries - Use LIMIT for large results - Archive old data - Expected: 20-50% improvement
Key Points EXPLAIN ANALYZE shows query execution Indexes must match WHERE/JOIN/ORDER BY Avoid functions in WHERE clauses Fix N+1 queries (join instead of loop) Monitor slow query log regularly Stats updates needed for accuracy Pre-calculate aggregations Archive historical data Use explain plans before/after Measure and monitor continuously