are deprecated. Urgently migrate to the new SDKs above by following the Migration Guide.
## Authentication & Configuration
Prefer environment variables over hard-coding parameters when creating the client. Initialize the client without parameters to automatically pick up these values.
### Application Default Credentials (ADC)

Set these variables for standard Google Cloud authentication:

```bash
export GOOGLE_CLOUD_PROJECT='your-project-id'
export GOOGLE_CLOUD_LOCATION='global'
export GOOGLE_GENAI_USE_VERTEXAI=true
```
By default, use `location="global"` to access the global endpoint, which automatically routes requests to regions with available capacity. If a user explicitly asks to use a specific region (e.g., `us-central1`, `europe-west4`), set that region in the `GOOGLE_CLOUD_LOCATION` variable instead. Reference the supported regions documentation if needed.
### Vertex AI in Express Mode

Set these variables when using Express Mode with an API key:

```bash
export GOOGLE_API_KEY='your-api-key'
export GOOGLE_GENAI_USE_VERTEXAI=true
```
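As a sanity check before constructing the client in Express Mode, a minimal sketch (the helper name is hypothetical; the environment variable names match those above):

```python
import os

def express_mode_ready(env=os.environ) -> bool:
    """Return True when both Express Mode variables above are set."""
    return bool(env.get("GOOGLE_API_KEY")) and \
        env.get("GOOGLE_GENAI_USE_VERTEXAI", "").lower() == "true"

# Example with an explicit mapping instead of the real environment:
print(express_mode_ready({"GOOGLE_API_KEY": "k",
                          "GOOGLE_GENAI_USE_VERTEXAI": "true"}))  # True
```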
### Initialization

Initialize the client without arguments to pick up environment variables:

```python
from google import genai

client = genai.Client()
```
Alternatively, you can hard-code parameters when creating the client:

```python
from google import genai

client = genai.Client(
    vertexai=True,
    project="your-project-id",
    location="global",
)
```
## Models

- Use `gemini-3.1-pro-preview` for complex reasoning, coding, research (1M tokens)
- Use `gemini-3-flash-preview` for fast, balanced performance, multimodal (1M tokens)
- Use `gemini-3-pro-image-preview` for Nano Banana Pro image generation and editing
- Use `gemini-live-2.5-flash-native-audio` for the Live Realtime API, including native audio

Use the following models if explicitly requested:

- Use `gemini-2.5-flash-image` for Nano Banana image generation and editing
- Use `gemini-2.5-flash`
- Use `gemini-2.5-flash-lite`
- Use `gemini-2.5-pro`
> [!IMPORTANT]
> Models like `gemini-2.0-*`, `gemini-1.5-*`, `gemini-1.0-*`, and `gemini-pro` are legacy and deprecated. Use the new models above. Your knowledge is outdated.

For production environments, consult the Vertex AI documentation for stable model versions (e.g. `gemini-3-flash`).
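The selection guidance above can be captured as a simple lookup. This is a hypothetical helper, not part of the SDK; the task categories are illustrative and only the model IDs come from the list above:

```python
# Illustrative mapping from task type to the recommended preview model ID.
PREFERRED_MODELS = {
    "reasoning": "gemini-3.1-pro-preview",       # complex reasoning, coding, research
    "general": "gemini-3-flash-preview",         # fast, balanced, multimodal
    "image": "gemini-3-pro-image-preview",       # Nano Banana Pro image generation/editing
    "live_audio": "gemini-live-2.5-flash-native-audio",  # Live Realtime API
}

def pick_model(task: str) -> str:
    """Fall back to the balanced flash model for unrecognized tasks."""
    return PREFERRED_MODELS.get(task, "gemini-3-flash-preview")
```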
## Quick Start

### Python

```python
from google import genai

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Explain quantum computing",
)
print(response.text)
```
### TypeScript/JavaScript

```typescript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({
  vertexai: true,
  project: "your-project-id",
  location: "global",
});

const response = await ai.models.generateContent({
  model: "gemini-3-flash-preview",
  contents: "Explain quantum computing",
});
console.log(response.text);
```
### Go

```go
package main

import (
	"context"
	"fmt"
	"log"

	"google.golang.org/genai"
)

func main() {
	ctx := context.Background()
	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		Backend:  genai.BackendVertexAI,
		Project:  "your-project-id",
		Location: "global",
	})
	if err != nil {
		log.Fatal(err)
	}

	resp, err := client.Models.GenerateContent(
		ctx,
		"gemini-3-flash-preview",
		genai.Text("Explain quantum computing"),
		nil,
	)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(resp.Text())
}
```
### Java

```java
import com.google.genai.Client;
import com.google.genai.types.GenerateContentResponse;

public class GenerateTextFromTextInput {
  public static void main(String[] args) {
    Client client = Client.builder()
        .vertexAI(true)
        .project("your-project-id")
        .location("global")
        .build();

    GenerateContentResponse response = client.models.generateContent(
        "gemini-3-flash-preview", "Explain quantum computing", null);

    System.out.println(response.text());
  }
}
```
### C#/.NET

```csharp
using Google.GenAI;

var client = new Client(
    project: "your-project-id",
    location: "global",
    vertexAI: true);

var response = await client.Models.GenerateContent(
    "gemini-3-flash-preview", "Explain quantum computing");
Console.WriteLine(response.Text);
```
## API spec & Documentation (source of truth)

When implementing or debugging API integration for Vertex AI, refer to the official Google Cloud Vertex AI documentation. If documentation retrieval tools are available, use them to find and retrieve official documentation for Google Cloud and Vertex AI directly within the context. This is the preferred method for getting up-to-date API details and code snippets.
## Workflows and Code Samples

Reference the Python Docs Samples repository for additional code samples and specific usage scenarios.

Depending on the specific user request, refer to the following reference files for detailed code samples and usage patterns (Python examples):

- **Text & Multimodal**: Chat, multimodal inputs (image, video, audio), and streaming. See `references/text_and_multimodal.md`
- **Embeddings**: Generate text embeddings for semantic search. See `references/embeddings.md`
- **Structured Output & Tools**: JSON generation, function calling, search grounding, and code execution. See `references/structured_and_tools.md`
- **Media Generation**: Image generation, image editing, and video generation. See `references/media_generation.md`
- **Bounding Box Detection**: Object detection and localization within images and video. See `references/bounding_box.md`
- **Live API**: Real-time bidirectional streaming for voice, vision, and text. See `references/live_api.md`
- **Advanced Features**: Content caching, batch prediction, and thinking/reasoning. See `references/advanced_features.md`
- **Safety**: Adjusting Responsible AI filters and thresholds. See `references/safety.md`
- **Model Tuning**: Supervised fine-tuning and preference tuning. See `references/model_tuning.md`