This skill covers using PostgreSQL with the pgvector extension as the storage backend for GrepAI.
When to Use This Skill
-
Team environments with shared index
-
Large codebases (10K+ files)
-
Need concurrent access
-
Integration with existing PostgreSQL infrastructure
Prerequisites
-
PostgreSQL 14+ with pgvector extension
-
Database user with create table permissions
-
Network access to PostgreSQL server
Advantages
| 👥 Team sharing | Multiple users can access same index
| 📏 Scalable | Handles large codebases
| 🔄 Concurrent | Multiple simultaneous searches
| 💾 Persistent | Data survives machine restarts
| 🔧 Familiar | Standard database tooling
Setting Up PostgreSQL with pgvector
Option 1: Docker (Recommended for Development)
# Run PostgreSQL with pgvector
docker run -d \
--name grepai-postgres \
-e POSTGRES_USER=grepai \
-e POSTGRES_PASSWORD=grepai \
-e POSTGRES_DB=grepai \
-p 5432:5432 \
pgvector/pgvector:pg16
Option 2: Install on Existing PostgreSQL
# Install pgvector extension (Ubuntu/Debian)
sudo apt install postgresql-16-pgvector
# Or compile from source
git clone https://github.com/pgvector/pgvector.git
cd pgvector
make
sudo make install
Then enable the extension:
-- Connect to your database
CREATE EXTENSION IF NOT EXISTS vector;
Option 3: Managed Services
-
Supabase: pgvector included by default
-
Neon: pgvector available
-
AWS RDS: Install pgvector extension
-
Azure Database: pgvector available
Configuration
Basic Configuration
# .grepai/config.yaml
store:
backend: postgres
postgres:
dsn: postgres://user:password@localhost:5432/grepai
With Environment Variable
store:
backend: postgres
postgres:
dsn: ${DATABASE_URL}
Set the environment variable:
export DATABASE_URL="postgres://user:password@localhost:5432/grepai"
Full DSN Options
store:
backend: postgres
postgres:
dsn: postgres://user:password@host:5432/database?sslmode=require
DSN components:
-
user: Database username -
password: Database password -
host: Server hostname or IP -
5432: Port (default: 5432) -
database: Database name -
sslmode: SSL mode (disable, require, verify-full)
SSL Modes
| disable
| No SSL
| Local development
| require
| SSL required
| Production
| verify-full
| SSL + verify certificate
| High security
# Production with SSL
store:
backend: postgres
postgres:
dsn: postgres://user:pass@prod.db.com:5432/grepai?sslmode=require
Database Schema
GrepAI automatically creates these tables:
-- Vector embeddings table
CREATE TABLE IF NOT EXISTS embeddings (
id SERIAL PRIMARY KEY,
file_path TEXT NOT NULL,
chunk_index INTEGER NOT NULL,
content TEXT NOT NULL,
start_line INTEGER,
end_line INTEGER,
embedding vector(768), -- Dimension matches your model
created_at TIMESTAMP DEFAULT NOW(),
UNIQUE(file_path, chunk_index)
);
-- Index for vector similarity search
CREATE INDEX ON embeddings USING ivfflat (embedding vector_cosine_ops);
Verifying Setup
Check pgvector Extension
-- Connect to database
psql -U grepai -d grepai
-- Check extension is installed
SELECT * FROM pg_extension WHERE extname = 'vector';
-- Check GrepAI tables exist (after first grepai watch)
\dt
Test Connection from GrepAI
# Check status
grepai status
# Should show PostgreSQL backend info
Performance Tuning
PostgreSQL Configuration
For better vector search performance:
-- Increase work memory for vector operations
SET work_mem = '256MB';
-- Adjust for your hardware
SET effective_cache_size = '4GB';
SET shared_buffers = '1GB';
Index Tuning
For large indices, tune the IVFFlat index:
-- More lists = faster search, more memory
CREATE INDEX ON embeddings
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100); -- Adjust based on row count
Rule of thumb: lists = sqrt(rows)
Concurrent Access
PostgreSQL handles concurrent access automatically:
-
Multiple
grepai searchcommands work simultaneously -
One
grepai watchdaemon per codebase -
Many users can share the same index
Team Setup
Shared Database
All team members point to the same database:
# Each developer's .grepai/config.yaml
store:
backend: postgres
postgres:
dsn: postgres://team:secret@shared-db.company.com:5432/grepai
Per-Project Databases
For isolated projects, use separate databases:
# Create databases
createdb -U postgres grepai_projecta
createdb -U postgres grepai_projectb
# Project A config
store:
backend: postgres
postgres:
dsn: postgres://user:pass@localhost:5432/grepai_projecta
Backup and Restore
Backup
pg_dump -U grepai -d grepai > grepai_backup.sql
Restore
psql -U grepai -d grepai < grepai_backup.sql
Migrating from GOB
-
Set up PostgreSQL with pgvector
-
Update configuration:
store:
backend: postgres
postgres:
dsn: postgres://user:pass@localhost:5432/grepai
- Delete old index:
rm .grepai/index.gob
- Re-index:
grepai watch
Common Issues
❌ Problem: FATAL: password authentication failed
✅ Solution: Check DSN credentials and pg_hba.conf
❌ Problem: ERROR: extension "vector" is not available
✅ Solution: Install pgvector:
sudo apt install postgresql-16-pgvector
# Then: CREATE EXTENSION vector;
❌ Problem: ERROR: type "vector" does not exist
✅ Solution: Enable extension in the database:
CREATE EXTENSION IF NOT EXISTS vector;
❌ Problem: Connection refused ✅ Solution:
-
Check PostgreSQL is running
-
Verify host and port
-
Check firewall rules
❌ Problem: Slow searches ✅ Solution:
-
Add IVFFlat index
-
Increase
work_mem -
Vacuum and analyze tables
Best Practices
-
Use environment variables: Don't commit credentials
-
Enable SSL: For remote databases
-
Regular backups: pg_dump before major changes
-
Monitor performance: Check query times
-
Index maintenance: Regular VACUUM ANALYZE
Output Format
PostgreSQL storage status:
✅ PostgreSQL Storage Configured
Backend: PostgreSQL + pgvector
Host: localhost:5432
Database: grepai
SSL: disabled
Contents:
- Files: 2,450
- Chunks: 12,340
- Vector dimension: 768
Performance:
- Connection: OK
- IVFFlat index: Yes
- Search latency: ~50ms