Gemini 3.1 Pro API: Developer Guide with Code Examples (2026)
Complete developer guide to the Gemini 3.1 Pro API. Covers model IDs (gemini-3.1-pro-preview-customtools), pricing, code examples in Python and JavaScript, custom tools, function calling, and how to integrate with your app.
TL;DR
| Gemini 3.1 Pro | |
|---|---|
| Model IDs | gemini-3.1-pro, gemini-3.1-pro-preview-customtools |
| Context window | 1M tokens |
| Input price | $2/1M tokens |
| Output price | $12/1M tokens |
| Key features | Custom tools, function calling, grounding, multimodal (text + image + audio + video) |
| API | Google AI Studio / Vertex AI |
Gemini 3.1 Pro is Google's latest frontier model, released March 2026. It's the cheapest frontier API per token, has native 1M context, and introduces custom tools — a new way to give the model access to external functions with structured schemas.
Model IDs
Google offers two variants of Gemini 3.1 Pro:
| Model ID | Description | Status |
|---|---|---|
| gemini-3.1-pro | Stable release, general availability | GA |
| gemini-3.1-pro-preview-customtools | Preview with enhanced custom tools support | Preview |
The customtools preview variant has improved reliability for complex function calling chains — use it if your app makes heavy use of tool calling. For general use, the stable gemini-3.1-pro is recommended.
```python
# Google AI Studio
model = "gemini-3.1-pro"

# Vertex AI
model = "gemini-3.1-pro@001"
```
Quick Start: Python
Installation
```bash
pip install google-genai
```
Basic Text Generation
```python
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-3.1-pro",
    contents="Explain quantum computing in 3 sentences."
)
print(response.text)
```
Streaming
```python
for chunk in client.models.generate_content_stream(
    model="gemini-3.1-pro",
    contents="Write a Python function to merge two sorted arrays."
):
    print(chunk.text, end="")
```
Quick Start: JavaScript
Installation
```bash
npm install @google/genai
```
Basic Text Generation
```javascript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: "YOUR_API_KEY" });

const response = await ai.models.generateContent({
  model: "gemini-3.1-pro",
  contents: "Explain quantum computing in 3 sentences.",
});
console.log(response.text);
```
Streaming
```javascript
const stream = await ai.models.generateContentStream({
  model: "gemini-3.1-pro",
  contents: "Write a JavaScript function to merge two sorted arrays.",
});
for await (const chunk of stream) {
  process.stdout.write(chunk.text);
}
```
Pricing
Gemini 3.1 Pro is the cheapest frontier model API as of March 2026.
| | Gemini 3.1 Pro | GPT-5.2 | Claude Sonnet 4.6 |
|---|---|---|---|
| Input | $2/1M | $5/1M | $3/1M |
| Output | $12/1M | $15/1M | $15/1M |
| Context | 1M | 400K | 1M (beta) |
| Cost per 100K in + 20K out | $0.44 | $0.80 | $0.60 |
At scale, Gemini 3.1 Pro costs roughly 45% less than GPT-5.2 and 27% less than Sonnet 4.6 per session.
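The per-session figures in the table are simple arithmetic on the per-million-token prices. A minimal sketch that reproduces them (prices hardcoded from the table above; this is an illustration, not an official SDK utility):

```python
# Per-million-token prices in USD, taken from the comparison table.
PRICES = {
    "gemini-3.1-pro": {"input": 2.00, "output": 12.00},
    "gpt-5.2": {"input": 5.00, "output": 15.00},
    "claude-sonnet-4.6": {"input": 3.00, "output": 15.00},
}

def session_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one session in USD."""
    p = PRICES[model]
    cost = (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000
    return round(cost, 2)

# 100K input + 20K output, as in the table:
print(session_cost("gemini-3.1-pro", 100_000, 20_000))     # 0.44
print(session_cost("gpt-5.2", 100_000, 20_000))            # 0.8
print(session_cost("claude-sonnet-4.6", 100_000, 20_000))  # 0.6
```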
Free Tier
Google AI Studio offers a free tier:
- 60 requests per minute
- 1M tokens per minute
- No credit card required
This is the most generous free API tier among the three major providers.
Key Features
1M Token Context Window
Gemini 3.1 Pro natively supports 1 million tokens of context — enough for:
- ~700,000 words of text
- ~30,000 lines of code
- ~1 hour of video
- ~11 hours of audio
Unlike competing models that offer extended context as a beta feature, Gemini's 1M context is fully GA and priced the same as standard context.
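Tokens are neither characters nor words; a common rule of thumb for English text is roughly four characters per token. A quick sanity-check helper for budgeting context (a heuristic only; use the SDK's token-counting endpoint for exact numbers):

```python
def rough_token_estimate(text: str) -> int:
    """Heuristic estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

# A 400-page book at roughly 2,000 characters per page:
print(rough_token_estimate("x" * 400 * 2000))  # 200000
```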
Custom Tools (Function Calling)
Custom tools let you define external functions that Gemini can call during generation. The model decides when to call a tool, structures the arguments, and incorporates the result into its response.
This is what the gemini-3.1-pro-preview-customtools variant is optimized for.
Grounding with Google Search
Gemini can ground its responses in real-time Google Search results. Enable grounding to reduce hallucinations and ensure the model uses current information.
Native Multimodal
Process text, images, audio, and video in a single request. No separate vision or audio models — Gemini handles all modalities natively.
Code Example: Custom Tools / Function Calling
This example creates a weather tool that Gemini can call to get current conditions.
Python
```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

# Define the tool
weather_tool = types.Tool(
    function_declarations=[
        types.FunctionDeclaration(
            name="get_weather",
            description="Get the current weather for a city",
            parameters=types.Schema(
                type=types.Type.OBJECT,
                properties={
                    "city": types.Schema(
                        type=types.Type.STRING,
                        description="City name, e.g. 'San Francisco'"
                    ),
                    "unit": types.Schema(
                        type=types.Type.STRING,
                        enum=["celsius", "fahrenheit"],
                        description="Temperature unit"
                    ),
                },
                required=["city"],
            ),
        )
    ]
)

# Send request with tool
response = client.models.generate_content(
    model="gemini-3.1-pro-preview-customtools",
    contents="What's the weather like in Tokyo?",
    config=types.GenerateContentConfig(
        tools=[weather_tool],
    ),
)

# Check if the model wants to call a function
for part in response.candidates[0].content.parts:
    if part.function_call:
        print(f"Function: {part.function_call.name}")
        print(f"Arguments: {part.function_call.args}")

# Output:
# Function: get_weather
# Arguments: {'city': 'Tokyo', 'unit': 'celsius'}

# In production, you'd call your actual weather API here,
# then send the result back to Gemini for a natural language response.
```
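To finish the loop, run the requested function yourself and feed its result back to the model so it can phrase a natural-language answer. A minimal sketch of the dispatch step (the `fetch_weather` stub and `TOOL_REGISTRY` are illustrative helpers, not SDK objects):

```python
# Local implementations for each declared tool (illustrative stubs).
def fetch_weather(city: str, unit: str = "celsius") -> dict:
    # Stub: a real app would call an actual weather API here.
    return {"city": city, "temp": 21, "unit": unit}

TOOL_REGISTRY = {"get_weather": fetch_weather}

def dispatch(name: str, args: dict) -> dict:
    """Run the tool the model asked for.

    Call as: dispatch(part.function_call.name, dict(part.function_call.args))
    """
    if name not in TOOL_REGISTRY:
        raise ValueError(f"Model requested unknown tool: {name}")
    return TOOL_REGISTRY[name](**args)

print(dispatch("get_weather", {"city": "Tokyo"}))
# {'city': 'Tokyo', 'temp': 21, 'unit': 'celsius'}
```

The returned dict is then wrapped in a function-response part and appended to the conversation in a second `generate_content` call; check the SDK reference for the exact helper your version exposes.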
JavaScript
```javascript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: "YOUR_API_KEY" });

const weatherTool = {
  functionDeclarations: [
    {
      name: "get_weather",
      description: "Get the current weather for a city",
      parameters: {
        type: "OBJECT",
        properties: {
          city: {
            type: "STRING",
            description: "City name, e.g. 'San Francisco'",
          },
          unit: {
            type: "STRING",
            enum: ["celsius", "fahrenheit"],
            description: "Temperature unit",
          },
        },
        required: ["city"],
      },
    },
  ],
};

const response = await ai.models.generateContent({
  model: "gemini-3.1-pro-preview-customtools",
  contents: "What's the weather like in Tokyo?",
  config: {
    tools: [weatherTool],
  },
});

// Check for function calls in the response
for (const part of response.candidates[0].content.parts) {
  if (part.functionCall) {
    console.log(`Function: ${part.functionCall.name}`);
    console.log(`Arguments:`, part.functionCall.args);
  }
}
```
Code Example: Multimodal (Image + Text)
Python
```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

# Read a local image as raw bytes (the SDK handles encoding)
with open("screenshot.png", "rb") as f:
    image_data = f.read()

response = client.models.generate_content(
    model="gemini-3.1-pro",
    contents=[
        types.Content(
            parts=[
                types.Part(text="What's in this screenshot? Describe the UI elements."),
                types.Part(
                    inline_data=types.Blob(
                        mime_type="image/png",
                        data=image_data,
                    )
                ),
            ]
        )
    ],
)
print(response.text)
```
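If you load different file types, the `mime_type` string can be derived from the filename with Python's standard library instead of being hardcoded. A small helper (not part of the google-genai SDK):

```python
import mimetypes

def guess_mime(path: str) -> str:
    """Guess the MIME type for an inline_data blob from the file extension."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None:
        raise ValueError(f"Cannot determine MIME type for {path!r}")
    return mime

print(guess_mime("screenshot.png"))  # image/png
print(guess_mime("photo.jpg"))       # image/jpeg
```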
JavaScript
```javascript
import { GoogleGenAI } from "@google/genai";
import fs from "fs";

const ai = new GoogleGenAI({ apiKey: "YOUR_API_KEY" });

const imageData = fs.readFileSync("screenshot.png");
const base64Image = imageData.toString("base64");

const response = await ai.models.generateContent({
  model: "gemini-3.1-pro",
  contents: [
    {
      parts: [
        { text: "What's in this screenshot? Describe the UI elements." },
        {
          inlineData: {
            mimeType: "image/png",
            data: base64Image,
          },
        },
      ],
    },
  ],
});
console.log(response.text);
```
API Comparison: Gemini 3.1 Pro vs GPT-5.2 vs Claude Sonnet 4.6
| Feature | Gemini 3.1 Pro | GPT-5.2 | Claude Sonnet 4.6 |
|---|---|---|---|
| Input price | $2/1M | $5/1M | $3/1M |
| Output price | $12/1M | $15/1M | $15/1M |
| Context window | 1M (GA) | 400K | 1M (beta) |
| Function calling | Yes (custom tools) | Yes | Yes (tool use) |
| Multimodal | Text + image + audio + video | Text + image + audio | Text + image |
| Grounding | Google Search | Web browsing | No native grounding |
| Streaming | Yes | Yes | Yes |
| Batch API | Yes | Yes | Yes |
| Free tier | 60 RPM, 1M TPM | Limited | Limited |
| SDK languages | Python, JS, Go, Dart, Swift | Python, JS | Python, JS |
| Coding (SWE-bench) | 76.8% | 80.0% | 79.6% |
| Computer use | N/A | 38.2% | 72.5% |
| Math (AIME) | ~88% | 100% | ~90% |
When to Choose Each API
Choose Gemini 3.1 Pro when:
- Cost is a primary concern (cheapest frontier API)
- You need native video or audio processing
- You need 1M context in production (GA, not beta)
- You want Google Search grounding
- You're building on Google Cloud

Choose GPT-5.2 when:
- Math-heavy reasoning is critical
- You're in the OpenAI ecosystem
- You need structured outputs with guaranteed JSON schemas
- Speed on simple queries matters most

Choose Claude Sonnet 4.6 when:
- Coding and agentic tasks are the primary use case
- You need computer use / browser automation
- Office productivity tasks are central (documents, spreadsheets)
- Prompt injection resistance matters (agent safety)
Integrating Gemini 3.1 Pro with Your App
Using with Y Build
If you're building a product with Y Build, you can integrate the Gemini API directly into your backend. Y Build projects deploy to Cloudflare Workers, which can call the Gemini API with low latency.
```javascript
// In a Y Build project (Cloudflare Worker)
export async function onRequest(context) {
  const response = await fetch(
    "https://generativelanguage.googleapis.com/v1beta/models/gemini-3.1-pro:generateContent",
    {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "x-goog-api-key": context.env.GEMINI_API_KEY,
      },
      body: JSON.stringify({
        contents: [{ parts: [{ text: "Your prompt here" }] }],
      }),
    }
  );
  const data = await response.json();
  return new Response(JSON.stringify(data));
}
```
Rate Limits
| Tier | Requests/min | Tokens/min |
|---|---|---|
| Free | 60 | 1,000,000 |
| Pay-as-you-go | 1,000 | 4,000,000 |
| Enterprise | Custom | Custom |
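When you exceed your tier's requests-per-minute cap, the API rejects the call (HTTP 429 on the REST endpoint), and the standard remedy is exponential backoff. A minimal, SDK-agnostic sketch; the `is_rate_limit` hook is a placeholder, since the exact exception type depends on your client library:

```python
import time

def with_backoff(call, max_retries=5, base_delay=1.0,
                 is_rate_limit=lambda exc: True, sleep=time.sleep):
    """Retry `call` with exponential backoff: 1s, 2s, 4s, ...

    `is_rate_limit` should inspect the exception and return True only for
    rate-limit errors; any other error is re-raised immediately.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except Exception as exc:
            if not is_rate_limit(exc) or attempt == max_retries - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

Wrap your `generate_content` call in a zero-argument closure and pass it to `with_backoff`; consider adding jitter in production to avoid synchronized retries.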
Frequently Asked Questions
What is gemini-3.1-pro-preview-customtools?
It's a preview variant of Gemini 3.1 Pro optimized for custom tools and function calling. It has improved reliability when the model needs to chain multiple tool calls together. Use it if your app relies heavily on function calling. For general text generation, use the stable gemini-3.1-pro model ID.
Is Gemini 3.1 Pro better than GPT-5.2?
It depends on the task. Gemini 3.1 Pro is cheaper, has a larger context window, and supports more modalities (video, audio). GPT-5.2 scores higher on coding benchmarks and math reasoning. For multimodal apps on a budget, Gemini wins. For pure reasoning tasks, GPT-5.2 leads.
How does Gemini 3.1 Pro compare to Claude Sonnet 4.6?
Gemini is cheaper ($2/$12 vs $3/$15 per million tokens) and has native video/audio support. Claude Sonnet 4.6 is better at coding (79.6% vs 76.8% on SWE-bench), computer use (72.5% vs N/A), and office tasks. Choose Gemini for multimodal and cost. Choose Claude for coding and agents.
Can I use Gemini 3.1 Pro for free?
Yes. Google AI Studio provides a free tier with 60 requests per minute and 1 million tokens per minute. No credit card required. This is sufficient for development, testing, and low-traffic production apps.
What's the difference between Google AI Studio and Vertex AI?
Google AI Studio is the simpler, developer-focused API — sign up with an API key and start making calls. Vertex AI is the enterprise platform — runs on Google Cloud, offers fine-tuning, model deployment, monitoring, and SLAs. Same model, different wrappers. Start with AI Studio, move to Vertex AI when you need enterprise features.
The Bottom Line
Gemini 3.1 Pro is the best-value frontier API as of March 2026. At $2/$12 per million tokens, a typical session costs roughly half what GPT-5.2 charges and about a quarter less than Claude Sonnet 4.6, with native 1M context and the broadest multimodal support.
For developers building AI-powered products, the practical advice is: use Gemini for multimodal and cost-sensitive tasks, Claude for coding and agents, and GPT-5.2 for math-heavy reasoning. Model routing across all three gives you the best of each.
Building an AI-powered product? Y Build handles the full stack — AI-assisted coding, one-click deploy to Cloudflare, Demo Cut for product videos, AI SEO, and analytics. Integrate Gemini, Claude, or GPT APIs into your app and ship in hours. Start free.