- Pipeline:
  - Index: Load → Split → Embed → Store
  - Retrieve: Query → Embed → Search → Return docs
  - Generate: Docs + Query → LLM → Response
- Key Components:
  - Document Loaders: ingest data from files, the web, and databases
  - Text Splitters: break documents into chunks
  - Embeddings: convert text to vectors
  - Vector Stores: store and search embeddings
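For a concrete sense of the embedding step, a minimal sketch (assuming an OpenAI API key is configured) that turns one query string into a vector:

```python
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vector = embeddings.embed_query("What is RAG?")
print(len(vector))  # 1536 dimensions for this model
```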
| Vector Store | Use Case                | Persistence |
| ------------ | ----------------------- | ----------- |
| InMemory     | Testing                 | Memory only |
| FAISS        | Local, high performance | Disk        |
| Chroma       | Development             | Disk        |
| Pinecone     | Production, managed     | Cloud       |
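All of these stores expose the same `from_documents` / `as_retriever` interface, so swapping backends is usually a one-line change. A minimal sketch, assuming `splits` and `embeddings` as produced in the pipeline below:

```python
from langchain_core.vectorstores import InMemoryVectorStore
# from langchain_community.vectorstores import FAISS  # requires the faiss-cpu package

# Same interface regardless of backend
store = InMemoryVectorStore.from_documents(splits, embeddings)  # testing
# store = FAISS.from_documents(splits, embeddings)              # local, persistable to disk
retriever = store.as_retriever(search_kwargs={"k": 4})
```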
Complete RAG Pipeline
```python
from langchain_core.documents import Document
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. Load documents
docs = [
    Document(page_content="LangChain is a framework for LLM apps.", metadata={}),
    Document(page_content="RAG = Retrieval Augmented Generation.", metadata={}),
]

# 2. Split documents
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
splits = splitter.split_documents(docs)

# 3. Create embeddings and store
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = InMemoryVectorStore.from_documents(splits, embeddings)

# 4. Create retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

# 5. Use in RAG
model = ChatOpenAI(model="gpt-4.1")
query = "What is RAG?"
relevant_docs = retriever.invoke(query)
context = "\n\n".join([doc.page_content for doc in relevant_docs])
response = model.invoke([
    {"role": "system", "content": f"Use this context:\n\n{context}"},
    {"role": "user", "content": query},
])
```
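The `response` above is a chat message object; a short follow-up sketch for reading the generated answer and the chunks it was grounded on:

```python
print(response.content)  # The model's answer text
for doc in relevant_docs:
    print("-", doc.page_content[:60])  # Retrieved context snippets
```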
End-to-end RAG pipeline: load documents, split into chunks, embed, store, retrieve, and generate a response.

```typescript
import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";
import { MemoryVectorStore } from "@langchain/classic/vectorstores/memory";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { Document } from "@langchain/core/documents";

// 1. Load documents
const docs = [
  new Document({ pageContent: "LangChain is a framework for LLM apps.", metadata: {} }),
  new Document({ pageContent: "RAG = Retrieval Augmented Generation.", metadata: {} }),
];

// 2. Split documents
const splitter = new RecursiveCharacterTextSplitter({ chunkSize: 500, chunkOverlap: 50 });
const splits = await splitter.splitDocuments(docs);

// 3. Create embeddings and store
const embeddings = new OpenAIEmbeddings({ model: "text-embedding-3-small" });
const vectorstore = await MemoryVectorStore.fromDocuments(splits, embeddings);

// 4. Create retriever
const retriever = vectorstore.asRetriever({ k: 4 });

// 5. Use in RAG
const model = new ChatOpenAI({ model: "gpt-4.1" });
const query = "What is RAG?";
const relevantDocs = await retriever.invoke(query);
const context = relevantDocs.map(doc => doc.pageContent).join("\n\n");
const response = await model.invoke([
  { role: "system", content: `Use this context:\n\n${context}` },
  { role: "user", content: query },
]);
```

Document Loaders

```python
from langchain_community.document_loaders import PyPDFLoader

# Load a PDF file; each page becomes a separate document
loader = PyPDFLoader("./document.pdf")
docs = loader.load()
print(f"Loaded {len(docs)} pages")
```

Load a PDF file and extract each page as a separate document.

```typescript
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";

const loader = new PDFLoader("./document.pdf");
const docs = await loader.load();
console.log(`Loaded ${docs.length} pages`);
```

```python
from langchain_community.document_loaders import WebBaseLoader

# Fetch and parse content from a web URL
loader = WebBaseLoader("https://docs.langchain.com")
docs = loader.load()
```

Fetch and parse content from a web URL into a document using Cheerio.

```typescript
import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";

const loader = new CheerioWebBaseLoader("https://docs.langchain.com");
const docs = await loader.load();
```

```python
from langchain_community.document_loaders import DirectoryLoader, TextLoader

# Load all text files from a directory
loader = DirectoryLoader(
    "path/to/documents",
    glob="**/*.txt",  # Pattern for files to load
    loader_cls=TextLoader,
)
docs = loader.load()
```
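Loaders attach provenance metadata to each `Document`; a quick sketch for inspecting it (the exact keys, e.g. `source` and `page`, depend on the loader):

```python
for doc in docs[:3]:
    print(doc.metadata)           # e.g. {"source": "./document.pdf", "page": 0}
    print(doc.page_content[:80])  # First 80 characters of the text
```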
Text Splitting
```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,    # Characters per chunk
    chunk_overlap=200,  # Overlap for context continuity
    separators=["\n\n", "\n", " ", ""],  # Split hierarchy
)
splits = splitter.split_documents(docs)
```
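To see what the splitter produces, a small sketch using `split_text` on a raw string (the tiny chunk size is only to make the overlap visible):

```python
demo_splitter = RecursiveCharacterTextSplitter(chunk_size=40, chunk_overlap=10)
chunks = demo_splitter.split_text(
    "LangChain splits long documents into overlapping chunks for retrieval."
)
for chunk in chunks:
    print(repr(chunk))  # Adjacent chunks can share up to 10 characters
```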
Vector Stores
```python
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

vectorstore = Chroma.from_documents(
    documents=splits,
    embedding=OpenAIEmbeddings(),
    persist_directory="./chroma_db",
    collection_name="my-collection",
)

# Load existing
vectorstore = Chroma(
    persist_directory="./chroma_db",
    embedding_function=OpenAIEmbeddings(),
    collection_name="my-collection",
)
```
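New documents can be appended to the same collection later; a sketch using `add_documents` (the explicit `ids` values are hypothetical, used here so re-running the snippet overwrites rather than duplicates):

```python
from langchain_core.documents import Document

vectorstore.add_documents(
    [Document(page_content="Chroma persists collections to disk.", metadata={"topic": "storage"})],
    ids=["chroma-note-001"],
)
```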
Create a Chroma vector store connected to a running Chroma server.

```typescript
import { Chroma } from "@langchain/community/vectorstores/chroma";
import { OpenAIEmbeddings } from "@langchain/openai";

const vectorstore = await Chroma.fromDocuments(
  splits,
  new OpenAIEmbeddings(),
  { collectionName: "my-collection", url: "http://localhost:8000" }
);
```
```python
from langchain_community.vectorstores import FAISS

vectorstore = FAISS.from_documents(splits, embeddings)
vectorstore.save_local("./faiss_index")

# Load (requires allow_dangerous_deserialization)
loaded = FAISS.load_local(
    "./faiss_index",
    embeddings,
    allow_dangerous_deserialization=True,
)
```
```typescript
import { FaissStore } from "@langchain/community/vectorstores/faiss";

const vectorstore = await FaissStore.fromDocuments(splits, embeddings);
await vectorstore.save("./faiss_index");
const loaded = await FaissStore.load("./faiss_index", embeddings);
```
Retrieval
```python
# Basic search
results = vectorstore.similarity_search(query, k=5)

# With scores
results_with_score = vectorstore.similarity_search_with_score(query, k=5)
for doc, score in results_with_score:
    print(f"Score: {score}, Content: {doc.page_content}")
```
Perform similarity search and retrieve results with relevance scores.

```typescript
// Basic search
const results = await vectorstore.similaritySearch(query, 5);

// With scores
const resultsWithScore = await vectorstore.similaritySearchWithScore(query, 5);
for (const [doc, score] of resultsWithScore) {
  console.log(`Score: ${score}, Content: ${doc.pageContent}`);
}
```
```python
# Search with filter
results = vectorstore.similarity_search(
    "programming",
    k=5,
    filter={"language": "python"},  # Only Python docs
)
```
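Filters can only match metadata that was attached at indexing time; a small sketch (the `language` key simply mirrors the filter above):

```python
from langchain_core.documents import Document

docs_with_meta = [
    Document(page_content="Python uses indentation to delimit blocks.", metadata={"language": "python"}),
    Document(page_content="TypeScript adds static types to JavaScript.", metadata={"language": "typescript"}),
]
vectorstore.add_documents(docs_with_meta)
```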
What You CAN Configure
- Chunk size/overlap
- Embedding model
- Number of results (k)
- Metadata filters
- Search algorithms: similarity, MMR (see the retriever sketch below)
What You CANNOT Configure
- Embedding dimensions (fixed per model)
- Mixing embeddings from different models in the same store
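A sketch of how those knobs appear on a retriever; the MMR parameters `fetch_k` and `lambda_mult` are shown with common example values, not recommendations:

```python
# Similarity search with top-k and a metadata filter
retriever = vectorstore.as_retriever(
    search_kwargs={"k": 4, "filter": {"language": "python"}},
)

# Maximal Marginal Relevance (MMR) trades relevance for diversity
mmr_retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 4, "fetch_k": 20, "lambda_mult": 0.5},
)
results = mmr_retriever.invoke("What is RAG?")
```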
Chunk size 500-1500 is typically good.

```python
# WRONG: Too small (loses context) or too large (hits limits)
splitter = RecursiveCharacterTextSplitter(chunk_size=50)
splitter = RecursiveCharacterTextSplitter(chunk_size=10000)

# CORRECT
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
```
```typescript
// CORRECT
const splitter = new RecursiveCharacterTextSplitter({ chunkSize: 1000, chunkOverlap: 200 });
```
```python
# WRONG: No overlap - context breaks at boundaries
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)

# CORRECT: 10-20% overlap
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
```
```python
# CORRECT
vectorstore = Chroma.from_documents(docs, embeddings, persist_directory="./chroma_db")
```
```typescript
// WRONG: Memory - lost on restart
const vectorstore = await MemoryVectorStore.fromDocuments(docs, embeddings);

// CORRECT
const vectorstore = await Chroma.fromDocuments(docs, embeddings, { collectionName: "my-collection" });
```
```python
# CORRECT: Same model
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma.from_documents(docs, embeddings)
retriever = vectorstore.as_retriever()  # Uses same embeddings
```
Use the same embedding model for indexing and querying.

```typescript
const embeddings = new OpenAIEmbeddings({ model: "text-embedding-3-small" });
const vectorstore = await Chroma.fromDocuments(docs, embeddings);
const retriever = vectorstore.asRetriever(); // Uses same embeddings
```
```python
# CORRECT
loaded_store = FAISS.load_local("./faiss_index", embeddings, allow_dangerous_deserialization=True)
```
```python
# WRONG: Index has 1536 dimensions but using 512-dim embeddings
pc.create_index(name="idx", dimension=1536, metric="cosine")
vectorstore = PineconeVectorStore.from_documents(
    docs,
    OpenAIEmbeddings(model="text-embedding-3-small", dimensions=512),
    index=pc.Index("idx"),
)  # Error: dimension mismatch!
```
```python
# CORRECT: Match dimensions
embeddings = OpenAIEmbeddings()  # Default 1536
```