PostgreSQL Semantic Search Quick Start 1. Setup CREATE EXTENSION IF NOT EXISTS vector ; CREATE TABLE documents ( id SERIAL PRIMARY KEY , content TEXT NOT NULL , embedding vector ( 1536 ) -- text-embedding-3-small -- Or: embedding halfvec(3072) -- text-embedding-3-large (50% memory) ) ; 2. Basic Semantic Search SELECT id , content , 1 - ( embedding <=> query_vec ) AS similarity FROM documents ORDER BY embedding <=> query_vec LIMIT 10 ; 3. Add Index (> 10k documents) CREATE INDEX ON documents USING hnsw ( embedding vector_cosine_ops ) ; Docker Quick Start
pgvector with PostgreSQL 17
docker run -d --name pgvector-db \ -e POSTGRES_PASSWORD = postgres \ -p 5432 :5432 \ pgvector/pgvector:pg17
Or PostgreSQL 18 (latest)
docker run -d --name pgvector-db \ -e POSTGRES_PASSWORD = postgres \ -p 5432 :5432 \ pgvector/pgvector:pg18
ParadeDB (includes pgvector + pg_search + BM25)
docker run -d --name paradedb \ -e POSTGRES_PASSWORD = postgres \ -p 5432 :5432 \ paradedb/paradedb:latest Connect: psql postgresql://postgres:postgres@localhost:5432/postgres Cheat Sheet Distance Operators embedding <=> query -- Cosine distance (1 - similarity) embedding < -
query -- L2/Euclidean distance embedding <
> query -- Negative inner product
- Common Queries
- -- Top 10 similar (cosine)
- SELECT
- *
- FROM
- docs
- ORDER
- BY
- embedding
- <=>
- $
- 1
- LIMIT
- 10
- ;
- -- With similarity score
- SELECT
- *
- ,
- 1
- -
- (
- embedding
- <=>
- $
- 1
- )
- AS
- similarity
- FROM
- docs
- ORDER
- BY
- 2
- DESC
- LIMIT
- 10
- ;
- -- With threshold
- SELECT
- *
- FROM
- docs
- WHERE
- embedding
- <=>
- $
- 1
- <
- 0.3
- ORDER
- BY
- 1
- LIMIT
- 10
- ;
- -- Preload index (run on startup)
- SELECT
- 1
- FROM
- docs
- ORDER
- BY
- embedding
- <=>
- $
- 1
- LIMIT
- 1
- ;
- Index Quick Reference
- -- HNSW (recommended)
- CREATE
- INDEX
- ON
- docs
- USING
- hnsw
- (
- embedding vector_cosine_ops
- )
- ;
- -- With tuning
- CREATE
- INDEX
- ON
- docs
- USING
- hnsw
- (
- embedding vector_cosine_ops
- )
- WITH
- (
- m
- =
- 24
- ,
- ef_construction
- =
- 200
- )
- ;
- -- Query-time recall
- SET
- hnsw
- .
- ef_search
- =
- 100
- ;
- -- Iterative scan for filtered queries (pgvector 0.8+)
- SET
- hnsw
- .
- iterative_scan
- =
- relaxed_order
- ;
- SET
- ivfflat
- .
- iterative_scan
- =
- on
- ;
- Decision Trees
- Choose Search Method
- Query type?
- ├─ Conceptual/meaning-based → Pure vector search
- ├─ Exact terms/names → Pure keyword search
- └─ Mixed/unknown → Hybrid search
- ├─ Simple setup → FTS + RRF (no extra extensions)
- ├─ Better ranking → BM25 + RRF (pg_search extension)
- └─ Full-featured → ParadeDB (Elasticsearch alternative)
- Choose Index Type
- Document count?
- ├─ < 10,000 → No index needed
- ├─ 10k - 1M → HNSW (best recall)
- └─ > 1M → IVFFlat (less memory) or HNSW
- Choose Vector Type
- Embedding model?
- ├─ text-embedding-3-small (1536) → vector(1536)
- ├─ text-embedding-3-large (3072) → halfvec(3072) (50% memory savings)
- └─ Other models → vector(dimensions)
- Operators
- Operator
- Distance
- Use Case
- <=>
- Cosine
- Text embeddings (default)
- <->
- L2/Euclidean
- Image embeddings
- <#>
- Inner product
- Normalized vectors
- SQL Functions
- Semantic Search
- match_documents(query_vec, threshold, limit)
- - Basic search
- match_documents_filtered(query_vec, metadata_filter, threshold, limit)
- - With JSONB filter
- match_chunks(query_vec, threshold, limit)
- - Search document chunks
- Hybrid Search (FTS)
- hybrid_search_fts(query_vec, query_text, limit, rrf_k, language)
- - FTS + RRF
- hybrid_search_weighted(query_vec, query_text, limit, sem_weight, kw_weight)
- - Linear combination
- hybrid_search_fallback(query_vec, query_text, limit)
- - Graceful degradation
- Hybrid Search (BM25)
- hybrid_search_bm25(query_vec, query_text, limit, rrf_k)
- - BM25 + RRF
- hybrid_search_bm25_highlighted(...)
- - With snippet highlighting
- hybrid_search_chunks_bm25(...)
- - For RAG with chunks
- Re-ranking (Optional)
- Two-stage retrieval improves precision: fast recall → precise rerank.
- When to Use
- Results need higher precision
- Using < 50 candidates after initial search
- Have budget for API calls (Cohere) or compute (local models)
- Options
- Method
- Latency
- Quality
- Cost
- Cohere Rerank v4.0-fast
- ~150ms
- Excellent
- $0.001/query
- Cohere Rerank v4.0-pro
- ~300ms
- Best
- $0.002/query
- Zerank 2
- ~100ms
- Best
- API cost
- Voyage Rerank 2.5
- ~100ms
- Excellent
- API cost
- Cross-encoder (local)
- ~500ms
- Very Good
- Compute
- TypeScript Example (Cohere)
- import
- {
- CohereClient
- }
- from
- 'cohere-ai'
- ;
- const
- cohere
- =
- new
- CohereClient
- (
- {
- token
- :
- process
- .
- env
- .
- COHERE_API_KEY
- }
- )
- ;
- async
- function
- rerankResults
- (
- query
- :
- string
- ,
- documents
- :
- string
- [
- ]
- )
- {
- const
- response
- =
- await
- cohere
- .
- rerank
- (
- {
- model
- :
- 'rerank-v4.0-fast'
- ,
- // or 'rerank-v4.0-pro' for best quality
- query
- ,
- documents
- ,
- topN
- :
- 10
- ,
- }
- )
- ;
- return
- response
- .
- results
- ;
- }
- reranking.md
- - Detailed guide
- References
- paradedb.md
- - ParadeDB full-text search (Elasticsearch alternative)
- vector-types.md
- - vector vs halfvec, dimensions, storage
- indexing.md
- - HNSW, IVFFlat, GIN parameters
- hybrid-search.md
- - FTS, BM25, RRF algorithms
- performance.md
- - Cold-start, memory, HNSW vs IVFFlat
- Scripts
- setup.sql
- - Extension and table setup
- semantic_search.sql
- - Semantic search functions
- hybrid_search_fts.sql
- - FTS hybrid functions
- hybrid_search_bm25.sql
- - BM25 hybrid functions
- indexes.sql
- - Index creation scripts
- Common Patterns
- TypeScript Integration (Supabase)
- // Semantic search
- const
- {
- data
- }
- =
- await
- supabase
- .
- rpc
- (
- 'match_documents'
- ,
- {
- query_embedding
- :
- embedding
- ,
- match_threshold
- :
- 0.7
- ,
- match_count
- :
- 10
- }
- )
- ;
- // Hybrid search
- const
- {
- data
- }
- =
- await
- supabase
- .
- rpc
- (
- 'hybrid_search_fts'
- ,
- {
- query_embedding
- :
- embedding
- ,
- query_text
- :
- userQuery
- ,
- match_count
- :
- 10
- ,
- rrf_k
- :
- 60
- ,
- fts_language
- :
- 'simple'
- }
- )
- ;
- Drizzle ORM
- import
- {
- sql
- }
- from
- 'drizzle-orm'
- ;
- const
- results
- =
- await
- db
- .
- execute
- (
- sql
- `
- SELECT * FROM match_documents(
- ${
- embedding
- }
- ::vector(1536),
- 0.7,
- 10
- )
- `
- )
- ;
- Troubleshooting
- Symptom
- Cause
- Solution
- Index not used
- < 10k rows or planner choice
- Normal for small tables, check with EXPLAIN
- Slow first query (30-60s)
- HNSW cold-start
- SELECT pg_prewarm('idx_name')
- or preload query
- Poor recall
- Low ef_search
- SET hnsw.ef_search = 100
- or higher
- FTS returns nothing
- Wrong language config
- Use
- 'simple'
- for mixed/unknown languages
- Memory error on index build
- maintenance_work_mem too low
- Increase to 2GB+
- Cosine similarity > 1
- Vectors not normalized
- Normalize before insert or use L2
- Slow inserts
- Index overhead
- Batch inserts, consider IVFFlat
- Version Info (January 2026)
- PostgreSQL 18.1
-
- Latest maintenance release with security fixes (Nov 2025)
- PostgreSQL 17.7
-
- Stable LTS option
- pgvector 0.8.1
-
- Iterative scans, PostgreSQL 18 support, halfvec up to 4000 dims
- pg_search 0.21.2
-
- MVCC visibility, parallel aggregation, varchar[] indexing
- Cohere Rerank v4.0
- 32K context, 100+ languages, self-learning (Dec 2025) External Documentation pgvector GitHub - Official extension, latest features OpenAI Embeddings Guide - Embedding models and best practices Supabase Vector Guide - Supabase-specific integration ParadeDB pg_search - BM25 extension documentation PostgreSQL FTS - Built-in full-text search