google-gemini-file-search

Installs: 332
Rank: #2791

Install

npx skills add https://github.com/jezweb/claude-skills --skill google-gemini-file-search

Google Gemini File Search Setup

Overview

Google Gemini File Search is a fully managed RAG system. Upload documents (100+ formats: PDF, Word, Excel, code) and query them in natural language; the service handles chunking, embeddings, semantic search, and citations automatically.

What This Skill Provides:

- Complete @google/genai File Search API setup
- 12 documented errors with prevention strategies
- Chunking best practices for optimal retrieval
- Cost optimization ($0.15/1M tokens indexing, 3x storage multiplier)
- Cloudflare Workers + Next.js integration templates

Prerequisites

1. Google AI API Key

Create an API key at https://aistudio.google.com/apikey

Free Tier Limits:

- 1 GB storage (total across all file search stores)
- 1,500 requests per day
- 1 million tokens per minute

Paid Tier Pricing:

- Indexing: $0.15 per 1M input tokens (one-time)
- Storage: Free (Tier 1: 10 GB, Tier 2: 100 GB, Tier 3: 1 TB)
- Query-time embeddings: Free (retrieved context counts as input tokens)

2. Node.js Environment

Minimum Version: Node.js 18+ (v20+ recommended)

node --version # Should be >=18.0.0

3. Install @google/genai SDK

npm install @google/genai

or

pnpm add @google/genai

or

yarn add @google/genai

Current Stable Version: 1.30.0+ (verify with npm view @google/genai version)

⚠️ Important: File Search requires @google/genai v1.29.0 (released November 5, 2025) or later. Earlier versions do not include the File Search API.

4. TypeScript Configuration (Optional but Recommended)

{
  "compilerOptions": {
    "target": "ES2020",
    "module": "ESNext",
    "moduleResolution": "node",
    "esModuleInterop": true,
    "strict": true,
    "skipLibCheck": true
  }
}

Common Errors Prevented

This skill prevents 12 common errors encountered when implementing File Search:

Error 1: Document Immutability

Symptom:

Error: Documents cannot be modified after indexing

Cause: Documents are immutable once indexed. There is no PATCH or UPDATE operation.

Prevention: Use the delete+re-upload pattern for updates:

// ❌ WRONG: Trying to update document (no such API)
await ai.fileSearchStores.documents.update({
  name: documentName,
  customMetadata: { version: '2.0' }
})

// ✅ CORRECT: Delete then re-upload
const docs = await ai.fileSearchStores.documents.list({ parent: fileStore.name })

const oldDoc = docs.documents.find(d => d.displayName === 'manual.pdf')
if (oldDoc) {
  await ai.fileSearchStores.documents.delete({ name: oldDoc.name, force: true })
}

await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('manual-v2.pdf'),
  config: { displayName: 'manual.pdf' }
})

Source: https://ai.google.dev/api/file-search/documents

Error 2: Storage Quota Exceeded

Symptom:

Error: Quota exceeded. Expected 1GB limit, but 3.2GB used.

Cause: Storage calculation includes input files + embeddings + metadata. Total storage ≈ 3x input size.

Prevention: Calculate storage before upload:

// ❌ WRONG: Assuming storage = file size
const fileSize = fs.statSync('data.pdf').size // 500 MB
// Expect 500 MB usage → WRONG

// ✅ CORRECT: Account for 3x multiplier
const fileSize = fs.statSync('data.pdf').size // 500 MB
const estimatedStorage = fileSize * 3 // 1.5 GB (embeddings + metadata)
console.log(`Estimated storage: ${estimatedStorage / 1e9} GB`)

// Check if within quota before upload
if (estimatedStorage > 1e9) {
  console.warn('⚠️ File may exceed free tier 1 GB limit')
}

Source: https://blog.google/technology/developers/file-search-gemini-api/
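The 3x multiplier can be combined with the tier limits listed under Prerequisites to see which tier a planned upload would need. The following is a minimal sketch, assuming decimal gigabytes (1 GB = 1e9 bytes) and treating the 3x factor as an approximation. `smallestTierFor` and the tier keys are hypothetical names for illustration, not part of the SDK, and the check ignores storage already in use.

```typescript
// Storage limits as quoted in this guide (decimal bytes assumed).
const TIER_LIMIT_BYTES = {
  free: 1e9,
  tier1: 10e9,
  tier2: 100e9,
  tier3: 1e12,
} as const;

type Tier = keyof typeof TIER_LIMIT_BYTES;

// Approximate multiplier: input files + embeddings + metadata.
const STORAGE_MULTIPLIER = 3;

// Returns the smallest tier whose limit covers the estimated storage,
// or null when even the largest tier is too small.
function smallestTierFor(inputBytes: number): Tier | null {
  const estimated = inputBytes * STORAGE_MULTIPLIER;
  const tiers: Tier[] = ['free', 'tier1', 'tier2', 'tier3'];
  for (const tier of tiers) {
    if (estimated <= TIER_LIMIT_BYTES[tier]) return tier;
  }
  return null;
}

// 500 MB of input is estimated at ~1.5 GB stored, which exceeds the free tier.
console.log(smallestTierFor(500e6)); // tier1
```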

Error 3: Incorrect Chunking Configuration

Symptom: Poor retrieval quality, irrelevant results, or context cutoff mid-sentence.

Cause: Default chunking may not be optimal for your content type.

Prevention: Use recommended chunking strategy:

// ❌ WRONG: Using defaults without testing
await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('docs.pdf')
  // Default chunking may be too large or too small
})

// ✅ CORRECT: Configure chunking for precision
await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('docs.pdf'),
  config: {
    chunkingConfig: {
      whiteSpaceConfig: {
        maxTokensPerChunk: 500, // Smaller chunks = more precise retrieval
        maxOverlapTokens: 50    // 10% overlap prevents context loss
      }
    }
  }
})

Chunking Guidelines:

- Technical docs/code: 500 tokens/chunk, 50 overlap
- Prose/articles: 800 tokens/chunk, 80 overlap
- Legal/contracts: 300 tokens/chunk, 30 overlap (high precision)

Source: https://www.philschmid.de/gemini-file-search-javascript

Error 4: Metadata Limits Exceeded

Symptom:

Error: Maximum 20 custom metadata key-value pairs allowed

Cause: Each document can have at most 20 metadata fields.

Prevention: Design compact metadata schema:

// ❌ WRONG: Too many metadata fields
await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('doc.pdf'),
  config: {
    customMetadata: {
      doc_type: 'manual',
      version: '1.0',
      author: 'John Doe',
      department: 'Engineering',
      created_date: '2025-01-01',
      // ... 18 more fields → Error!
    }
  }
})

// ✅ CORRECT: Use hierarchical keys or JSON strings
await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('doc.pdf'),
  config: {
    customMetadata: {
      doc_type: 'manual',
      version: '1.0',
      author_dept: 'John Doe|Engineering', // Combine related fields
      dates: JSON.stringify({              // Or use JSON for complex data
        created: '2025-01-01',
        updated: '2025-01-15'
      })
    }
  }
})

Source: https://ai.google.dev/api/file-search/documents
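A guard that runs before upload can catch an oversized schema early. This is a minimal sketch, assuming metadata is passed as a flat object of string values as in the examples above; `assertMetadataWithinLimit` is a hypothetical helper name, not an SDK function.

```typescript
const MAX_METADATA_FIELDS = 20; // per-document limit stated above

// Throws before upload if the metadata object has too many keys;
// otherwise returns the number of fields.
function assertMetadataWithinLimit(metadata: Record<string, string>): number {
  const count = Object.keys(metadata).length;
  if (count > MAX_METADATA_FIELDS) {
    throw new Error(
      `customMetadata has ${count} fields; the limit is ${MAX_METADATA_FIELDS}`
    );
  }
  return count;
}

const fieldCount = assertMetadataWithinLimit({ doc_type: 'manual', version: '1.0' });
console.log(fieldCount); // 2
```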

Error 5: Indexing Cost Surprises

Symptom: Unexpected bill for $375 after uploading 10 GB of documents.

Cause: Indexing costs are one-time but calculated per input token ($0.15/1M tokens).

Prevention: Estimate costs before indexing:

// ❌ WRONG: No cost estimation
await uploadAllDocuments(fileStore.name, './data') // 10 GB uploaded → $375 surprise

// ✅ CORRECT: Calculate costs upfront
const totalSize = getTotalDirectorySize('./data') // 10 GB
const estimatedTokens = totalSize / 4 // Rough estimate: 1 token ≈ 4 bytes
const indexingCost = (estimatedTokens / 1e6) * 0.15

console.log(`Estimated indexing cost: $${indexingCost.toFixed(2)}`)
console.log(`Estimated storage: ${(totalSize * 3) / 1e9} GB`)

// Confirm before proceeding
const proceed = await confirm(`Proceed with indexing? Cost: $${indexingCost.toFixed(2)}`)
if (proceed) {
  await uploadAllDocuments(fileStore.name, './data')
}

Cost Examples:

- 1 GB text ≈ 250M tokens = $37.50 indexing
- 100 MB PDF ≈ 25M tokens = $3.75 indexing
- 10 MB code ≈ 2.5M tokens = $0.38 indexing

Source: https://ai.google.dev/pricing
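The cost examples above follow from one formula: bytes divided by 4 gives tokens, and each million tokens costs $0.15. A small sketch of that arithmetic, assuming the 4-bytes-per-token heuristic holds for your content (it is only a rough estimate, and binary formats such as PDF may tokenize very differently); the function name is hypothetical.

```typescript
// Assumptions taken from this guide, not from the SDK:
const BYTES_PER_TOKEN = 4;           // rough heuristic for plain text
const USD_PER_MILLION_TOKENS = 0.15; // one-time indexing price

// Estimated one-time indexing cost in USD for a given input size.
function estimateIndexingCostUsd(inputBytes: number): number {
  const tokens = inputBytes / BYTES_PER_TOKEN;
  return (tokens / 1e6) * USD_PER_MILLION_TOKENS;
}

console.log(`1 GB:   $${estimateIndexingCostUsd(1e9).toFixed(2)}`);
console.log(`100 MB: $${estimateIndexingCostUsd(100e6).toFixed(2)}`);
console.log(`10 GB:  $${estimateIndexingCostUsd(10e9).toFixed(2)}`);
```

Running the estimate before a bulk upload makes the 10 GB case from the symptom above visible ahead of time.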

Error 6: Not Polling Operation Status

Symptom: Query returns no results immediately after upload, or incomplete indexing.

Cause: File uploads are processed asynchronously. Must poll operation until done: true.

Prevention: Always poll operation status with timeout and fallback:

// ❌ WRONG: Assuming upload is instant
const operation = await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('large.pdf')
})
// Immediately query → No results!

// ✅ CORRECT: Poll until indexing complete with timeout
// Declared with `let` because the operation is reassigned on every poll
let operation = await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('large.pdf')
})

// Poll with timeout and fallback
const MAX_POLL_TIME = 60000 // 60 seconds
const POLL_INTERVAL = 1000
let elapsed = 0

while (!operation.done && elapsed < MAX_POLL_TIME) {
  await new Promise(resolve => setTimeout(resolve, POLL_INTERVAL))
  elapsed += POLL_INTERVAL

  try {
    operation = await ai.operations.get({ name: operation.name })
    console.log(`Indexing progress: ${operation.metadata?.progress || 'processing...'}`)
  } catch (error) {
    console.warn('Polling failed, assuming complete:', error)
    break
  }
}

if (operation.error) {
  throw new Error(`Indexing failed: ${operation.error.message}`)
}

// ⚠️ Warning: operations.get() can be unreliable for large files
// If timeout reached, verify document exists manually
if (elapsed >= MAX_POLL_TIME) {
  console.warn('Polling timeout - verifying document manually')
  const docs = await ai.fileSearchStores.documents.list({ parent: fileStore.name })
  const uploaded = docs.documents?.find(d => d.displayName === 'large.pdf')
  if (uploaded) {
    console.log('✅ Document found despite polling timeout')
  } else {
    throw new Error('Upload failed - document not found')
  }
}

console.log('✅ Indexing complete:', operation.response?.displayName)

Source: https://ai.google.dev/api/file-search/file-search-stores#uploadtofilesearchstore, GitHub Issue #1211
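The poll-with-timeout pattern can be factored into a reusable helper so every upload uses the same logic. This is a sketch under stated assumptions: `pollUntilDone` and `PollOptions` are hypothetical names, and the refresh step is passed in as a function so the helper makes no claim about any SDK signature.

```typescript
interface PollOptions {
  intervalMs?: number;
  timeoutMs?: number;
}

// Repeatedly calls `refresh` until the operation reports done or the
// deadline passes. `refresh` is injected so the helper does not depend
// on any particular SDK call.
async function pollUntilDone<T extends { done?: boolean }>(
  initial: T,
  refresh: (current: T) => Promise<T>,
  options: PollOptions = {}
): Promise<{ operation: T; timedOut: boolean }> {
  const intervalMs = options.intervalMs ?? 1000;
  const timeoutMs = options.timeoutMs ?? 60000;
  const deadline = Date.now() + timeoutMs;

  let operation = initial;
  while (!operation.done && Date.now() < deadline) {
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    operation = await refresh(operation);
  }
  // timedOut is true only when the loop ended without completion.
  return { operation, timedOut: !operation.done };
}
```

With the client from this guide, `refresh` would wrap the operations lookup used in the example above. When `timedOut` is true, fall back to listing the store's documents to confirm whether the upload landed.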

Error 7: Forgetting Force Delete

Symptom:

Error: Cannot delete store with documents. Set force=true.

Cause: Stores with documents require force: true to delete (prevents accidental deletion).

Prevention: Always use force: true when deleting non-empty stores:

// ❌ WRONG: Trying to delete store with documents
await ai.fileSearchStores.delete({ name: fileStore.name })
// Error: Cannot delete store with documents

// ✅ CORRECT: Use force delete
await ai.fileSearchStores.delete({
  name: fileStore.name,
  force: true // Deletes store AND all documents
})

// Alternative: Delete documents first
const docs = await ai.fileSearchStores.documents.list({ parent: fileStore.name })
for (const doc of docs.documents || []) {
  await ai.fileSearchStores.documents.delete({ name: doc.name, force: true })
}
await ai.fileSearchStores.delete({ name: fileStore.name })

Source: https://ai.google.dev/api/file-search/file-search-stores#delete

Error 8: Using Unsupported Models

Symptom:

Error: File Search is only supported for Gemini 3 Pro and Flash models

Cause: File Search requires Gemini 3 Pro or Gemini 3 Flash. Gemini 2.x and 1.5 models are not supported.

Prevention: Always use Gemini 3 models:

// ❌ WRONG: Using Gemini 1.5 model
const response = await ai.models.generateContent({
  model: 'gemini-1.5-pro', // Not supported!
  contents: 'What is the installation procedure?',
  config: {
    tools: [{ fileSearch: { fileSearchStoreNames: [fileStore.name] } }]
  }
})

// ✅ CORRECT: Use Gemini 3 models
const response = await ai.models.generateContent({
  model: 'gemini-3-flash', // ✅ Supported (fast, cost-effective)
  // OR
  // model: 'gemini-3-pro', // ✅ Supported (higher quality)
  contents: 'What is the installation procedure?',
  config: {
    tools: [{ fileSearch: { fileSearchStoreNames: [fileStore.name] } }]
  }
})

Source: https://ai.google.dev/gemini-api/docs/file-search

Error 9: displayName Not Preserved for Blob Sources (Fixed v1.34.0+)

Symptom:

groundingChunks[0].title === null // No document source shown

Cause: In @google/genai versions prior to v1.34.0, when uploading files as Blob objects (not file paths), the SDK dropped the displayName and customMetadata configuration fields.

Prevention:

// ✅ CORRECT: Upgrade to v1.34.0+ for automatic fix
// Shell: npm install @google/genai@latest (v1.34.0+)

await ai.fileSearchStores.uploadToFileSearchStore({
  name: storeName,
  file: new Blob([arrayBuffer], { type: 'application/pdf' }),
  config: {
    displayName: 'Safety Manual.pdf',    // ✅ Now preserved
    customMetadata: { version: '1.0' }   // ✅ Now preserved
  }
})

// ⚠️ WORKAROUND for v1.33.0 and earlier: Use resumable upload
const uploadUrl = `https://generativelanguage.googleapis.com/upload/v1beta/${storeName}:uploadToFileSearchStore?key=${API_KEY}`

// Step 1: Initiate with displayName in body
const initResponse = await fetch(uploadUrl, {
  method: 'POST',
  headers: {
    'X-Goog-Upload-Protocol': 'resumable',
    'X-Goog-Upload-Command': 'start',
    'X-Goog-Upload-Header-Content-Length': numBytes.toString(),
    'X-Goog-Upload-Header-Content-Type': 'application/pdf',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    displayName: 'Safety Manual.pdf' // ✅ Works with resumable upload
  })
})

// Step 2: Upload file bytes
const uploadUrl2 = initResponse.headers.get('X-Goog-Upload-URL')
await fetch(uploadUrl2, {
  method: 'PUT',
  headers: {
    'Content-Length': numBytes.toString(),
    'X-Goog-Upload-Offset': '0',
    'X-Goog-Upload-Command': 'upload, finalize',
    'Content-Type': 'application/pdf'
  },
  body: fileBytes
})

Source: GitHub Issue #1078

Error 10: Grounding Metadata Ignored with JSON Response Mode

Symptom:

response.candidates[0].groundingMetadata === undefined // Even though fileSearch tool is configured

Cause: When using responseMimeType: 'application/json' for structured output, the API ignores the fileSearch tool and returns no grounding metadata, even with Gemini 3 models.

Prevention:

// ❌ WRONG: Structured output overrides grounding
const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'Summarize guidelines',
  config: {
    responseMimeType: 'application/json', // Loses grounding
    tools: [{ fileSearch: { fileSearchStoreNames: [storeName] } }]
  }
})

// ✅ CORRECT: Two-step approach
// Step 1: Get grounded text response
const textResponse = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'Summarize guidelines',
  config: {
    tools: [{ fileSearch: { fileSearchStoreNames: [storeName] } }]
  }
})

const grounding = textResponse.candidates[0].groundingMetadata

// Step 2: Convert to structured format in prompt
const jsonResponse = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: `Convert to JSON: ${textResponse.text}

Format: { "summary": "...", "key_points": ["..."] }`,
  config: {
    responseMimeType: 'application/json',
    responseSchema: {
      type: 'object',
      properties: {
        summary: { type: 'string' },
        key_points: { type: 'array', items: { type: 'string' } }
      }
    }
  }
})

// Combine results
const result = {
  data: JSON.parse(jsonResponse.text),
  sources: grounding.groundingChunks
}

Source: GitHub Issue #829

Error 11: Google Search and File Search Tools Are Mutually Exclusive

Symptom:

Error: "Search as a tool and file search tool are not supported together" Status: INVALID_ARGUMENT

Cause: The Gemini API does not allow using googleSearch and fileSearch tools in the same request.

Prevention:

// ❌ WRONG: Combining search tools
const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'What are the latest industry guidelines?',
  config: {
    tools: [
      { googleSearch: {} },
      { fileSearch: { fileSearchStoreNames: [storeName] } }
    ]
  }
})

// ✅ CORRECT: Use separate specialist agents
async function searchWeb(query: string) {
  return ai.models.generateContent({
    model: 'gemini-3-flash',
    contents: query,
    config: { tools: [{ googleSearch: {} }] }
  })
}

async function searchDocuments(query: string) {
  return ai.models.generateContent({
    model: 'gemini-3-flash',
    contents: query,
    config: {
      tools: [{ fileSearch: { fileSearchStoreNames: [storeName] } }]
    }
  })
}

// Orchestrate based on query type
const needsWeb = query.includes('latest') || query.includes('current')
const response = needsWeb
  ? await searchWeb(query)
  : await searchDocuments(query)

Source: GitHub Issue #435, Google Codelabs

Error 12: Batch API Missing Response Metadata (Community-sourced)

Symptom: Cannot correlate batch responses with requests when using metadata field.

Cause: When using Batch API with InlinedRequest that includes a metadata field, the corresponding InlinedResponse does not return the metadata.

Prevention:

// ❌ WRONG: Expecting metadata in response
const batchRequest = {
  metadata: { key: 'my-request-id' },
  contents: [{ parts: [{ text: 'Question?' }], role: 'user' }],
  config: {
    tools: [{ fileSearch: { fileSearchStoreNames: [storeName] } }]
  }
}

const batchResponse = await ai.batch.create({ requests: [batchRequest] })
console.log(batchResponse.responses[0].metadata) // ❌ undefined

// ✅ CORRECT: Use array index to correlate
const requests = [
  { metadata: { id: 'req-1' }, contents: [...] },
  { metadata: { id: 'req-2' }, contents: [...] }
]

const responses = await ai.batch.create({ requests })

// Map by index (not ideal but works)
responses.responses.forEach((response, i) => {
  const requestMetadata = requests[i].metadata
  console.log(`Response for ${requestMetadata.id}:`, response)
})

Community Verification: Maintainer confirmed, internal bug filed.

Source: GitHub Issue #1191
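The index-based workaround can be wrapped in a small generic helper that also catches a length mismatch. This is an illustrative sketch: `correlateByIndex` is a hypothetical name, and it assumes responses come back in the same order as the requests, which is what the workaround above relies on.

```typescript
// Pairs each request with the response at the same position.
// A length mismatch is treated as an error rather than silently
// pairing the wrong items.
function correlateByIndex<Req, Res>(
  requests: Req[],
  responses: Res[]
): Array<{ request: Req; response: Res }> {
  if (requests.length !== responses.length) {
    throw new Error(
      `Expected ${requests.length} responses, got ${responses.length}`
    );
  }
  return requests.map((request, i) => ({ request, response: responses[i] }));
}

const pairs = correlateByIndex(
  [{ id: 'req-1' }, { id: 'req-2' }],
  ['first answer', 'second answer']
);
console.log(pairs[1].request.id, '->', pairs[1].response); // req-2 -> second answer
```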

Setup Instructions

Step 1: Initialize Client

import { GoogleGenAI } from '@google/genai'
import fs from 'fs'

// Verify API key is set before constructing the client
if (!process.env.GOOGLE_API_KEY) {
  throw new Error('GOOGLE_API_KEY environment variable is required')
}

// Initialize client with API key
const ai = new GoogleGenAI({ apiKey: process.env.GOOGLE_API_KEY })

Step 2: Create File Search Store

// Create a store (container for documents)
const fileStore = await ai.fileSearchStores.create({
  config: {
    displayName: 'my-knowledge-base', // Human-readable name
    // Optional: Add store-level metadata
    customMetadata: {
      project: 'customer-support',
      environment: 'production'
    }
  }
})

console.log('Created store:', fileStore.name) // Output: fileSearchStores/abc123xyz...

Finding Existing Stores:

// List all stores (paginated)
const stores = await ai.fileSearchStores.list({
  pageSize: 20 // Max 20 per page
})

// Find by display name
let targetStore = null
let pageToken = null

do {
  const page = await ai.fileSearchStores.list({ pageToken })
  targetStore = page.fileSearchStores.find(
    s => s.displayName === 'my-knowledge-base'
  )
  pageToken = page.nextPageToken
} while (!targetStore && pageToken)

if (targetStore) {
  console.log('Found existing store:', targetStore.name)
} else {
  console.log('Store not found, creating new one...')
}

Step 3: Upload Documents

Single File Upload:

// Declared with `let` because the operation is reassigned while polling
let operation = await ai.fileSearchStores.uploadToFileSearchStore({
  name: fileStore.name,
  file: fs.createReadStream('./docs/manual.pdf'),
  config: {
    displayName: 'Installation Manual',
    customMetadata: {
      doc_type: 'manual',
      version: '1.0',
      language: 'en'
    },
    chunkingConfig: {
      whiteSpaceConfig: {
        maxTokensPerChunk: 500,
        maxOverlapTokens: 50
      }
    }
  }
})

// Poll until indexing complete
while (!operation.done) {
  await new Promise(resolve => setTimeout(resolve, 1000))
  operation = await ai.operations.get({ name: operation.name })
}

console.log('✅ Indexed:', operation.response.displayName)

Batch Upload (Concurrent):

const filePaths = [
  './docs/manual.pdf',
  './docs/faq.md',
  './docs/troubleshooting.docx'
]

// Upload all files concurrently
const uploadPromises = filePaths.map(filePath =>
  ai.fileSearchStores.uploadToFileSearchStore({
    name: fileStore.name,
    file: fs.createReadStream(filePath),
    config: {
      displayName: filePath.split('/').pop(),
      customMetadata: {
        doc_type: 'support',
        source_path: filePath
      },
      chunkingConfig: {
        whiteSpaceConfig: {
          maxTokensPerChunk: 500,
          maxOverlapTokens: 50
        }
      }
    }
  })
)

const operations = await Promise.all(uploadPromises)

// Poll all operations
for (const operation of operations) {
  let op = operation
  while (!op.done) {
    await new Promise(resolve => setTimeout(resolve, 1000))
    op = await ai.operations.get({ name: op.name })
  }
  console.log('✅ Indexed:', op.response.displayName)
}

Step 4: Query with File Search

Basic Query:

const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'What are the safety precautions for installation?',
  config: {
    tools: [{ fileSearch: { fileSearchStoreNames: [fileStore.name] } }]
  }
})

console.log('Answer:', response.text)

// Access citations
const grounding = response.candidates[0].groundingMetadata
if (grounding?.groundingChunks) {
  console.log('\nSources:')
  grounding.groundingChunks.forEach((chunk, i) => {
    console.log(`${i + 1}. ${chunk.retrievedContext?.title || 'Unknown'}`)
    console.log(`   URI: ${chunk.retrievedContext?.uri || 'N/A'}`)
  })
}
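When several chunks come from the same document, the source list repeats titles. The sketch below collapses them into a unique list. It assumes the grounding shape read in the example above (`groundingChunks[].retrievedContext.title`); the interfaces are minimal structural types written for this illustration, not the SDK's own types, and `uniqueSourceTitles` is a hypothetical helper.

```typescript
// Minimal structural types covering only the fields read below.
interface GroundingChunk {
  retrievedContext?: { title?: string; uri?: string };
}
interface GroundingMetadata {
  groundingChunks?: GroundingChunk[];
}

// Returns each cited document title once, in first-seen order.
// Chunks without a title are reported as 'Unknown'.
function uniqueSourceTitles(grounding?: GroundingMetadata): string[] {
  const seen = new Set<string>();
  for (const chunk of grounding?.groundingChunks ?? []) {
    seen.add(chunk.retrievedContext?.title ?? 'Unknown');
  }
  return Array.from(seen);
}

const titles = uniqueSourceTitles({
  groundingChunks: [
    { retrievedContext: { title: 'manual.pdf' } },
    { retrievedContext: { title: 'faq.md' } },
    { retrievedContext: { title: 'manual.pdf' } },
  ],
});
console.log(titles.join(', ')); // manual.pdf, faq.md
```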

Query with Metadata Filtering:

const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'How do I reset the device?',
  config: {
    tools: [{
      fileSearch: {
        fileSearchStoreNames: [fileStore.name],
        // Filter to only search troubleshooting docs in English, version 1.0
        metadataFilter: 'doc_type="troubleshooting" AND language="en" AND version="1.0"'
      }
    }]
  }
})

console.log('Answer:', response.text)

Metadata Filter Syntax:

- AND: key1="value1" AND key2="value2"
- OR: key1="value1" OR key1="value2"
- Parentheses: (key1="a" OR key1="b") AND key2="c"

Step 5: List and Manage Documents

// List all documents in store
const docs = await ai.fileSearchStores.documents.list({
  parent: fileStore.name,
  pageSize: 20
})

console.log(`Total documents: ${docs.documents?.length || 0}`)

docs.documents?.forEach(doc => {
  console.log(`- ${doc.displayName} (${doc.name})`)
  console.log('  Metadata:', doc.customMetadata)
})

// Get specific document details
const docDetails = await ai.fileSearchStores.documents.get({
  name: docs.documents[0].name
})

console.log('Document details:', docDetails)

// Delete document
await ai.fileSearchStores.documents.delete({
  name: docs.documents[0].name,
  force: true
})

Step 6: Cleanup

// Delete entire store (force deletes all documents)
await ai.fileSearchStores.delete({
  name: fileStore.name,
  force: true
})

console.log('✅ Store deleted')

Recommended Chunking Strategies

Chunking configuration significantly impacts retrieval quality. Adjust based on content type:

Technical Documentation

chunkingConfig: {
  whiteSpaceConfig: {
    maxTokensPerChunk: 500, // Smaller chunks for precise code/API lookup
    maxOverlapTokens: 50    // 10% overlap
  }
}

Best for: API docs, SDK references, code examples, configuration guides

Prose and Articles

chunkingConfig: {
  whiteSpaceConfig: {
    maxTokensPerChunk: 800, // Larger chunks preserve narrative flow
    maxOverlapTokens: 80    // 10% overlap
  }
}

Best for: Blog posts, news articles, product descriptions, marketing materials

Legal and Contracts

chunkingConfig: {
  whiteSpaceConfig: {
    maxTokensPerChunk: 300, // Very small chunks for high precision
    maxOverlapTokens: 30    // 10% overlap
  }
}

Best for: Legal documents, contracts, regulations, compliance docs

FAQ and Support

chunkingConfig: {
  whiteSpaceConfig: {
    maxTokensPerChunk: 400, // Medium chunks (1-2 Q&A pairs)
    maxOverlapTokens: 40    // 10% overlap
  }
}

Best for: FAQs, troubleshooting guides, how-to articles

General Rule: Maintain 10% overlap (overlap = chunk size / 10) to prevent context loss at chunk boundaries.
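The four presets and the 10% rule can be captured in one lookup so that upload code does not repeat the numbers. A minimal sketch, assuming the chunk sizes recommended in this section; `ContentType` and `chunkingConfigFor` are hypothetical names introduced for illustration.

```typescript
type ContentType = 'technical' | 'prose' | 'legal' | 'faq';

// Chunk sizes recommended in this section.
const MAX_TOKENS_PER_CHUNK: Record<ContentType, number> = {
  technical: 500,
  prose: 800,
  legal: 300,
  faq: 400,
};

// Builds a chunkingConfig object in the shape used throughout this guide,
// deriving the overlap from the 10% rule.
function chunkingConfigFor(contentType: ContentType) {
  const maxTokensPerChunk = MAX_TOKENS_PER_CHUNK[contentType];
  return {
    whiteSpaceConfig: {
      maxTokensPerChunk,
      maxOverlapTokens: Math.round(maxTokensPerChunk / 10),
    },
  };
}

const legalConfig = chunkingConfigFor('legal');
console.log(
  legalConfig.whiteSpaceConfig.maxTokensPerChunk,
  legalConfig.whiteSpaceConfig.maxOverlapTokens
); // 300 30
```

The result can be passed as `chunkingConfig` in the upload `config` shown in the earlier examples.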

Metadata Best Practices

Design metadata schema for filtering and organization:

Example: Customer Support Knowledge Base

customMetadata: {
  doc_type: 'faq' | 'manual' | 'troubleshooting' | 'guide',
  product: 'widget-pro' | 'widget-lite',
  version: '1.0' | '2.0',
  language: 'en' | 'es' | 'fr',
  category: 'installation' | 'configuration' | 'maintenance',
  priority: 'critical' | 'normal' | 'low',
  last_updated: '2025-01-15',
  author: 'support-team'
}

Query Example:

metadataFilter: 'product="widget-pro" AND (doc_type="troubleshooting" OR doc_type="faq") AND language="en"'
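Filter strings like the one above are easy to mistype when assembled by hand. The sketch below builds them from an object, following the AND / OR / parentheses syntax described in this guide. `buildMetadataFilter` is a hypothetical helper, and because escaping rules are not covered here, it simply rejects values that contain a double quote.

```typescript
type FilterValue = string | string[];

// One clause per key, joined with AND. An array of values for a key
// becomes a parenthesized OR group.
function buildMetadataFilter(criteria: Record<string, FilterValue>): string {
  const clauses: string[] = [];
  for (const key of Object.keys(criteria)) {
    const raw = criteria[key];
    const values = Array.isArray(raw) ? raw : [raw];
    if (values.length === 0) continue; // nothing to match on for this key
    for (const value of values) {
      // Escaping is not documented in this guide, so quotes are rejected.
      if (value.indexOf('"') !== -1) {
        throw new Error(`Unsupported quote in value for ${key}`);
      }
    }
    const alternatives = values.map((value) => `${key}="${value}"`);
    clauses.push(
      alternatives.length > 1 ? `(${alternatives.join(' OR ')})` : alternatives[0]
    );
  }
  return clauses.join(' AND ');
}

console.log(
  buildMetadataFilter({
    product: 'widget-pro',
    doc_type: ['troubleshooting', 'faq'],
    language: 'en',
  })
);
// product="widget-pro" AND (doc_type="troubleshooting" OR doc_type="faq") AND language="en"
```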

Example: Legal Document Repository

customMetadata: {
  doc_type: 'contract' | 'regulation' | 'case-law' | 'policy',
  jurisdiction: 'US' | 'EU' | 'UK',
  practice_area: 'employment' | 'corporate' | 'ip' | 'tax',
  effective_date: '2025-01-01',
  status: 'active' | 'archived',
  confidentiality: 'public' | 'internal' | 'privileged'
}

Example: Code Documentation

customMetadata: {
  doc_type: 'api-reference' | 'tutorial' | 'example' | 'changelog',
  language: 'javascript' | 'python' | 'java' | 'go',
  framework: 'react' | 'nextjs' | 'express' | 'fastapi',
  version: '1.2.0',
  difficulty: 'beginner' | 'intermediate' | 'advanced'
}

Tips:

- Use consistent key naming (snake_case or camelCase)
- Limit to most important filterable fields (20 max)
- Use enums/constants for values (easier filtering)
- Include version and date fields for time-based filtering

Cost Optimization

1. Deduplicate Before Upload

// Track uploaded file hashes to avoid duplicates
const uploadedHashes = new Set()

async function uploadWithDeduplication(filePath: string) {
  // getFileHash is a helper you supply (for example, a SHA-256 of the file contents)
  const fileHash = await getFileHash(filePath)

  if (uploadedHashes.has(fileHash)) {
    console.log(`Skipping duplicate: ${filePath}`)
    return
  }

  await ai.fileSearchStores.uploadToFileSearchStore({
    name: fileStore.name,
    file: fs.createReadStream(filePath)
  })

  uploadedHashes.add(fileHash)
}
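The deduplication example calls `getFileHash` without defining it. One possible implementation is sketched here using Node's built-in `crypto` module with SHA-256. The choice of hash and the `hashBytes` helper are assumptions made for illustration; any stable content hash would serve the same purpose.

```typescript
import { createHash } from 'node:crypto';
import { readFile } from 'node:fs/promises';

// SHA-256 of raw content, returned as a hex string.
function hashBytes(data: Uint8Array | string): string {
  return createHash('sha256').update(data).digest('hex');
}

// Reads the whole file into memory before hashing; for very large files
// a streaming hash would be a better fit.
async function getFileHash(filePath: string): Promise<string> {
  return hashBytes(await readFile(filePath));
}

// Identical content always produces the same digest, so two copies of a
// file under different names are detected as duplicates.
console.log(hashBytes('hello') === hashBytes('hello')); // true
```

Note that the in-memory `Set` in the example above only deduplicates within a single run; persisting the hashes (for instance in document metadata) would be needed to detect duplicates across runs.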

2. Compress Large Files

// Convert images to text before indexing (OCR)
// Compress PDFs (remove images, use text-only)
// Use markdown instead of Word docs (smaller size)

3. Use Metadata Filtering to Reduce Query Scope

// ❌ EXPENSIVE: Search all 10GB of documents
const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'Reset procedure?',
  config: {
    tools: [{ fileSearch: { fileSearchStoreNames: [fileStore.name] } }]
  }
})

// ✅ CHEAPER: Filter to only troubleshooting docs (subset)
const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'Reset procedure?',
  config: {
    tools: [{
      fileSearch: {
        fileSearchStoreNames: [fileStore.name],
        metadataFilter: 'doc_type="troubleshooting"' // Reduces search scope
      }
    }]
  }
})

4. Choose Flash Over Pro for Cost Savings

// Gemini 3 Flash is 10x cheaper than Pro for queries
// Use Flash unless you need Pro's advanced reasoning

// Development/testing: Use Flash
model: 'gemini-3-flash'

// Production (high-stakes answers): Use Pro
model: 'gemini-3-pro'

5. Monitor Storage Usage

// List stores and estimate storage
const stores = await ai.fileSearchStores.list()

for (const store of stores.fileSearchStores || []) {
  const docs = await ai.fileSearchStores.documents.list({ parent: store.name })

  console.log(`Store: ${store.displayName}`)
  console.log(`Documents: ${docs.documents?.length || 0}`)
  // Estimate storage (3x input size); the 10 MB per document figure is a
  // placeholder, so substitute your own average stored size
  console.log(`Estimated storage: ~${(docs.documents?.length || 0) * 10} MB`)
}

Testing & Verification

Verify Store Creation

const store = await ai.fileSearchStores.get({ name: fileStore.name })

console.assert(store.displayName === 'my-knowledge-base', 'Store name mismatch')
console.log('✅ Store created successfully')

Verify Document Indexing

const docs = await ai.fileSearchStores.documents.list({ parent: fileStore.name })

console.assert(docs.documents?.length > 0, 'No documents indexed')
console.log(`✅ ${docs.documents?.length} documents indexed`)

Verify Query Functionality

const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'What is this knowledge base about?',
  config: {
    tools: [{ fileSearch: { fileSearchStoreNames: [fileStore.name] } }]
  }
})

console.assert(response.text.length > 0, 'Empty response')
console.log('✅ Query successful:', response.text.substring(0, 100) + '...')

Verify Citations

const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'Provide a specific answer with citations.',
  config: {
    tools: [{ fileSearch: { fileSearchStoreNames: [fileStore.name] } }]
  }
})

const grounding = response.candidates[0].groundingMetadata
console.assert(
  grounding?.groundingChunks?.length > 0,
  'No grounding/citations returned'
)
console.log(`✅ ${grounding?.groundingChunks?.length} citations returned`)

Integration Examples

Streaming Support

File Search supports streaming responses with generateContentStream():

// ✅ Streaming works with File Search (v1.34.0+)
const stream = await ai.models.generateContentStream({
  model: 'gemini-3-flash',
  contents: 'Summarize the document',
  config: {
    tools: [{ fileSearch: { fileSearchStoreNames: [storeName] } }]
  }
})

// Keep the most recent chunk. Assumption: grounding metadata arrives on the
// streamed chunks, since the stream object itself has no candidates property.
let lastChunk
for await (const chunk of stream) {
  process.stdout.write(chunk.text ?? '')
  lastChunk = chunk
}

// Access grounding after stream completes
const grounding = lastChunk?.candidates?.[0]?.groundingMetadata

Note: Early SDK versions (pre-v1.34.0) may have had streaming issues. Use v1.34.0+ for reliable streaming support.

Source: GitHub Issue #1221

Working Templates

This skill includes 3 working templates in the templates/ directory:

Template 1: basic-node-rag

Minimal Node.js/TypeScript example demonstrating:

- Create file search store
- Upload multiple documents
- Query with natural language
- Display citations

Use when: Learning File Search, prototyping, simple CLI tools

Run:

cd templates/basic-node-rag
npm install
npm run dev

Template 2: cloudflare-worker-rag

Cloudflare Workers integration showing:

- Edge API for document upload
- Edge API for semantic search
- Integration with R2 for document storage
- Hybrid architecture (Gemini File Search + Cloudflare edge)

Use when: Building global edge applications, integrating with Cloudflare stack

Deploy:

cd templates/cloudflare-worker-rag
npm install
npx wrangler deploy

Template 3: nextjs-docs-search

Full-stack Next.js application featuring:

- Document upload UI with drag-and-drop
- Real-time search interface
- Citation rendering with source links
- Metadata filtering UI

Use when: Building production documentation sites, knowledge bases

Run:

cd templates/nextjs-docs-search
npm install
npm run dev

References

Official Documentation:

- File Search Overview: https://ai.google.dev/gemini-api/docs/file-search
- API Reference (Stores): https://ai.google.dev/api/file-search/file-search-stores
- API Reference (Documents): https://ai.google.dev/api/file-search/documents
- Blog Announcement: https://blog.google/technology/developers/file-search-gemini-api/
- Pricing: https://ai.google.dev/pricing

Tutorials:

- JavaScript/TypeScript Guide: https://www.philschmid.de/gemini-file-search-javascript
- SDK Repository: https://github.com/googleapis/js-genai

Bundled Resources in This Skill:

- references/api-reference.md - Complete API documentation
- references/chunking-best-practices.md - Detailed chunking strategies
- references/pricing-calculator.md - Cost estimation guide
- references/migration-from-openai.md - Migration guide from OpenAI Files API
- scripts/create-store.ts - CLI tool to create stores
- scripts/upload-batch.ts - Batch upload script
- scripts/query-store.ts - Interactive query tool
- scripts/cleanup.ts - Cleanup script

Working Templates:

- templates/basic-node-rag/ - Minimal Node.js example
- templates/cloudflare-worker-rag/ - Edge deployment example
- templates/nextjs-docs-search/ - Full-stack Next.js app

Skill Version: 1.1.0
Last Verified: 2026-01-21
Package Version: @google/genai ^1.38.0 (minimum 1.29.0 required)
Token Savings: ~67%
Errors Prevented: 12
Changes: Added 4 new errors from community research (displayName Blob issue, grounding with JSON mode, tool conflicts, batch API metadata), enhanced polling timeout pattern with fallback verification, added streaming support note
