Gemini 3.1 Pro API: Developer Guide with Code Examples (2026)
Complete developer guide to the Gemini 3.1 Pro API. Covers model IDs (gemini-3.1-pro-preview-customtools), pricing, code examples in Python and JavaScript, custom tools, function calling, and how to integrate with your app.
TL;DR
| Gemini 3.1 Pro | |
|---|---|
| Model IDs | gemini-3.1-pro, gemini-3.1-pro-preview-customtools |
| Context window | 1M tokens |
| Input price | $2/1M tokens |
| Output price | $12/1M tokens |
| Key features | Custom tools, function calling, grounding, multimodal (text + image + audio + video) |
| API | Google AI Studio / Vertex AI |
Gemini 3.1 Pro is Google's latest frontier model, released March 2026. It's the cheapest frontier API per token, has native 1M context, and introduces custom tools — a new way to give the model access to external functions with structured schemas.
Model IDs
Google offers two variants of Gemini 3.1 Pro:
| Model ID | Description | Status |
|---|---|---|
| gemini-3.1-pro | Stable release, general availability | GA |
| gemini-3.1-pro-preview-customtools | Preview with enhanced custom tools support | Preview |
The customtools preview variant has improved reliability for complex function calling chains — use it if your app makes heavy use of tool calling. For general use, the stable gemini-3.1-pro is recommended.
```python
# Google AI Studio
model = "gemini-3.1-pro"

# Vertex AI
model = "gemini-3.1-pro@001"
```
Quick Start: Python
Installation
```bash
pip install google-genai
```
Basic Text Generation
```python
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-3.1-pro",
    contents="Explain quantum computing in 3 sentences."
)
print(response.text)
```
Streaming
```python
for chunk in client.models.generate_content_stream(
    model="gemini-3.1-pro",
    contents="Write a Python function to merge two sorted arrays."
):
    print(chunk.text, end="")
```
Quick Start: JavaScript
Installation
```bash
npm install @google/genai
```
Basic Text Generation
```javascript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: "YOUR_API_KEY" });

const response = await ai.models.generateContent({
  model: "gemini-3.1-pro",
  contents: "Explain quantum computing in 3 sentences.",
});
console.log(response.text);
```
Streaming
```javascript
const stream = await ai.models.generateContentStream({
  model: "gemini-3.1-pro",
  contents: "Write a JavaScript function to merge two sorted arrays.",
});
for await (const chunk of stream) {
  process.stdout.write(chunk.text);
}
```
Pricing
Gemini 3.1 Pro is the cheapest frontier model API as of March 2026.
| | Gemini 3.1 Pro | GPT-5.2 | Claude Sonnet 4.6 |
|---|---|---|---|
| Input | $2/1M | $5/1M | $3/1M |
| Output | $12/1M | $15/1M | $15/1M |
| Context | 1M | 400K | 1M (beta) |
| Cost per 100K in + 20K out | $0.44 | $0.80 | $0.60 |
At scale, Gemini 3.1 Pro costs roughly 45% less than GPT-5.2 and 27% less than Sonnet 4.6 per session.
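The per-session figures in the table are simple arithmetic on the per-million-token prices. A minimal sketch that reproduces them (prices hardcoded from the table above; this is an illustration, not an official SDK utility):

```python
# Per-million-token prices in USD, taken from the comparison table.
PRICES = {
    "gemini-3.1-pro": {"input": 2.00, "output": 12.00},
    "gpt-5.2": {"input": 5.00, "output": 15.00},
    "claude-sonnet-4.6": {"input": 3.00, "output": 15.00},
}

def session_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one session in USD."""
    p = PRICES[model]
    cost = (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000
    return round(cost, 2)

# 100K input + 20K output, as in the table:
print(session_cost("gemini-3.1-pro", 100_000, 20_000))     # 0.44
print(session_cost("gpt-5.2", 100_000, 20_000))            # 0.8
print(session_cost("claude-sonnet-4.6", 100_000, 20_000))  # 0.6
```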
Free Tier
Google AI Studio offers a free tier:
- 60 requests per minute
- 1M tokens per minute
- No credit card required
This is the most generous free API tier among the three major providers.
Key Features
1M Token Context Window
Gemini 3.1 Pro natively supports 1 million tokens of context — enough for:
- ~700,000 words of text
- ~30,000 lines of code
- ~1 hour of video
- ~11 hours of audio
Unlike competing models that offer extended context as a beta feature, Gemini's 1M context is fully GA and priced the same as standard context.
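Tokens are neither characters nor words; a common rule of thumb for English text is roughly four characters per token. A quick sanity-check helper for budgeting context (a heuristic only; use the SDK's token-counting endpoint for exact numbers):

```python
def rough_token_estimate(text: str) -> int:
    """Heuristic estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

# A 400-page book at roughly 2,000 characters per page:
print(rough_token_estimate("x" * 400 * 2000))  # 200000
```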
Custom Tools (Function Calling)
Custom tools let you define external functions that Gemini can call during generation. The model decides when to call a tool, structures the arguments, and incorporates the result into its response.
This is what the gemini-3.1-pro-preview-customtools variant is optimized for.
Grounding with Google Search
Gemini can ground its responses in real-time Google Search results. Enable grounding to reduce hallucinations and ensure the model uses current information.
Native Multimodal
Process text, images, audio, and video in a single request. No separate vision or audio models — Gemini handles all modalities natively.
Code Example: Custom Tools / Function Calling
This example creates a weather tool that Gemini can call to get current conditions.
Python
```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

# Define the tool
weather_tool = types.Tool(
    function_declarations=[
        types.FunctionDeclaration(
            name="get_weather",
            description="Get the current weather for a city",
            parameters=types.Schema(
                type=types.Type.OBJECT,
                properties={
                    "city": types.Schema(
                        type=types.Type.STRING,
                        description="City name, e.g. 'San Francisco'"
                    ),
                    "unit": types.Schema(
                        type=types.Type.STRING,
                        enum=["celsius", "fahrenheit"],
                        description="Temperature unit"
                    ),
                },
                required=["city"],
            ),
        )
    ]
)

# Send request with tool
response = client.models.generate_content(
    model="gemini-3.1-pro-preview-customtools",
    contents="What's the weather like in Tokyo?",
    config=types.GenerateContentConfig(
        tools=[weather_tool],
    ),
)

# Check if the model wants to call a function
for part in response.candidates[0].content.parts:
    if part.function_call:
        print(f"Function: {part.function_call.name}")
        print(f"Arguments: {part.function_call.args}")

# Output:
# Function: get_weather
# Arguments: {'city': 'Tokyo', 'unit': 'celsius'}

# In production, you'd call your actual weather API here,
# then send the result back to Gemini for a natural language response.
```
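To finish the loop, run the requested function yourself and feed its result back to the model so it can phrase a natural-language answer. A minimal sketch of the dispatch step (the `fetch_weather` stub and `TOOL_REGISTRY` are illustrative helpers, not SDK objects):

```python
# Local implementations for each declared tool (illustrative stubs).
def fetch_weather(city: str, unit: str = "celsius") -> dict:
    # Stub: a real app would call an actual weather API here.
    return {"city": city, "temp": 21, "unit": unit}

TOOL_REGISTRY = {"get_weather": fetch_weather}

def dispatch(name: str, args: dict) -> dict:
    """Run the tool the model asked for.

    Call as: dispatch(part.function_call.name, dict(part.function_call.args))
    """
    if name not in TOOL_REGISTRY:
        raise ValueError(f"Model requested unknown tool: {name}")
    return TOOL_REGISTRY[name](**args)

print(dispatch("get_weather", {"city": "Tokyo"}))
# {'city': 'Tokyo', 'temp': 21, 'unit': 'celsius'}
```

The returned dict is then wrapped in a function-response part and appended to the conversation in a second `generate_content` call; check the SDK reference for the exact helper your version exposes.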
JavaScript
```javascript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: "YOUR_API_KEY" });

const weatherTool = {
  functionDeclarations: [
    {
      name: "get_weather",
      description: "Get the current weather for a city",
      parameters: {
        type: "OBJECT",
        properties: {
          city: {
            type: "STRING",
            description: "City name, e.g. 'San Francisco'",
          },
          unit: {
            type: "STRING",
            enum: ["celsius", "fahrenheit"],
            description: "Temperature unit",
          },
        },
        required: ["city"],
      },
    },
  ],
};

const response = await ai.models.generateContent({
  model: "gemini-3.1-pro-preview-customtools",
  contents: "What's the weather like in Tokyo?",
  config: {
    tools: [weatherTool],
  },
});

// Check for function calls in the response
for (const part of response.candidates[0].content.parts) {
  if (part.functionCall) {
    console.log(`Function: ${part.functionCall.name}`);
    console.log(`Arguments:`, part.functionCall.args);
  }
}
```
Code Example: Multimodal (Image + Text)
Python
```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

# Read a local image as raw bytes (the SDK handles encoding)
with open("screenshot.png", "rb") as f:
    image_data = f.read()

response = client.models.generate_content(
    model="gemini-3.1-pro",
    contents=[
        types.Content(
            parts=[
                types.Part(text="What's in this screenshot? Describe the UI elements."),
                types.Part(
                    inline_data=types.Blob(
                        mime_type="image/png",
                        data=image_data,
                    )
                ),
            ]
        )
    ],
)
print(response.text)
```
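If you load different file types, the `mime_type` string can be derived from the filename with Python's standard library instead of being hardcoded. A small helper (not part of the google-genai SDK):

```python
import mimetypes

def guess_mime(path: str) -> str:
    """Guess the MIME type for an inline_data blob from the file extension."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None:
        raise ValueError(f"Cannot determine MIME type for {path!r}")
    return mime

print(guess_mime("screenshot.png"))  # image/png
print(guess_mime("photo.jpg"))       # image/jpeg
```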
JavaScript
```javascript
import { GoogleGenAI } from "@google/genai";
import fs from "fs";

const ai = new GoogleGenAI({ apiKey: "YOUR_API_KEY" });

const imageData = fs.readFileSync("screenshot.png");
const base64Image = imageData.toString("base64");

const response = await ai.models.generateContent({
  model: "gemini-3.1-pro",
  contents: [
    {
      parts: [
        { text: "What's in this screenshot? Describe the UI elements." },
        {
          inlineData: {
            mimeType: "image/png",
            data: base64Image,
          },
        },
      ],
    },
  ],
});
console.log(response.text);
```
API Comparison: Gemini 3.1 Pro vs GPT-5.2 vs Claude Sonnet 4.6
| Feature | Gemini 3.1 Pro | GPT-5.2 | Claude Sonnet 4.6 |
|---|---|---|---|
| Input price | $2/1M | $5/1M | $3/1M |
| Output price | $12/1M | $15/1M | $15/1M |
| Context window | 1M (GA) | 400K | 1M (beta) |
| Function calling | Yes (custom tools) | Yes | Yes (tool use) |
| Multimodal | Text + image + audio + video | Text + image + audio | Text + image |
| Grounding | Google Search | Web browsing | No native grounding |
| Streaming | Yes | Yes | Yes |
| Batch API | Yes | Yes | Yes |
| Free tier | 60 RPM, 1M TPM | Limited | Limited |
| SDK languages | Python, JS, Go, Dart, Swift | Python, JS | Python, JS |
| Coding (SWE-bench) | 76.8% | 80.0% | 79.6% |
| Computer use | N/A | 38.2% | 72.5% |
| Math (AIME) | ~88% | 100% | ~90% |
When to Choose Each API
Choose Gemini 3.1 Pro when:
- Cost is a primary concern (cheapest frontier API)
- You need native video or audio processing
- You need 1M context in production (GA, not beta)
- You want Google Search grounding
- You're building on Google Cloud

Choose GPT-5.2 when:
- Math-heavy reasoning is critical
- You're in the OpenAI ecosystem
- You need structured outputs with guaranteed JSON schemas
- Speed on simple queries matters most

Choose Claude Sonnet 4.6 when:
- Coding and agentic tasks are the primary use case
- You need computer use / browser automation
- Office productivity tasks are central (documents, spreadsheets)
- Prompt injection resistance matters (agent safety)
Integrating Gemini 3.1 Pro with Your App
Using with Y Build
If you're building a product with Y Build, you can integrate the Gemini API directly into your backend. Y Build projects deploy to Cloudflare Workers, which can call the Gemini API with low latency.
```javascript
// In a Y Build project (Cloudflare Worker)
export async function onRequest(context) {
  const response = await fetch(
    "https://generativelanguage.googleapis.com/v1beta/models/gemini-3.1-pro:generateContent",
    {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "x-goog-api-key": context.env.GEMINI_API_KEY,
      },
      body: JSON.stringify({
        contents: [{ parts: [{ text: "Your prompt here" }] }],
      }),
    }
  );
  const data = await response.json();
  return new Response(JSON.stringify(data));
}
```
Rate Limits
| Tier | Requests/min | Tokens/min |
|---|---|---|
| Free | 60 | 1,000,000 |
| Pay-as-you-go | 1,000 | 4,000,000 |
| Enterprise | Custom | Custom |
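When you exceed your tier's requests-per-minute cap, the API rejects the call (HTTP 429 on the REST endpoint), and the standard remedy is exponential backoff. A minimal, SDK-agnostic sketch; the `is_rate_limit` hook is a placeholder, since the exact exception type depends on your client library:

```python
import time

def with_backoff(call, max_retries=5, base_delay=1.0,
                 is_rate_limit=lambda exc: True, sleep=time.sleep):
    """Retry `call` with exponential backoff: 1s, 2s, 4s, ...

    `is_rate_limit` should inspect the exception and return True only for
    rate-limit errors; any other error is re-raised immediately.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except Exception as exc:
            if not is_rate_limit(exc) or attempt == max_retries - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

Wrap your `generate_content` call in a zero-argument closure and pass it to `with_backoff`; consider adding jitter in production to avoid synchronized retries.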
Frequently Asked Questions
What is gemini-3.1-pro-preview-customtools?
It's a preview variant of Gemini 3.1 Pro optimized for custom tools and function calling. It has improved reliability when the model needs to chain multiple tool calls together. Use it if your app relies heavily on function calling. For general text generation, use the stable gemini-3.1-pro model ID.
Is Gemini 3.1 Pro better than GPT-5.2?
It depends on the task. Gemini 3.1 Pro is cheaper, has a larger context window, and supports more modalities (video, audio). GPT-5.2 scores higher on coding benchmarks and math reasoning. For multimodal apps on a budget, Gemini wins. For pure reasoning tasks, GPT-5.2 leads.
How does Gemini 3.1 Pro compare to Claude Sonnet 4.6?
Gemini is cheaper ($2/$12 vs $3/$15 per million tokens) and has native video/audio support. Claude Sonnet 4.6 is better at coding (79.6% vs 76.8% on SWE-bench), computer use (72.5% vs N/A), and office tasks. Choose Gemini for multimodal and cost. Choose Claude for coding and agents.
Can I use Gemini 3.1 Pro for free?
Yes. Google AI Studio provides a free tier with 60 requests per minute and 1 million tokens per minute. No credit card required. This is sufficient for development, testing, and low-traffic production apps.
What's the difference between Google AI Studio and Vertex AI?
Google AI Studio is the simpler, developer-focused API — sign up with an API key and start making calls. Vertex AI is the enterprise platform — runs on Google Cloud, offers fine-tuning, model deployment, monitoring, and SLAs. Same model, different wrappers. Start with AI Studio, move to Vertex AI when you need enterprise features.
The Bottom Line
Gemini 3.1 Pro is the best-value frontier API as of March 2026. At $2/$12 per million tokens, a typical session costs roughly half what GPT-5.2 charges and about a quarter less than Claude Sonnet 4.6, with native 1M context and the broadest multimodal support.
For developers building AI-powered products, the practical advice is: use Gemini for multimodal and cost-sensitive tasks, Claude for coding and agents, and GPT-5.2 for math-heavy reasoning. Model routing across all three gives you the best of each.
Building an AI-powered product? Y Build handles the full stack — AI-assisted coding, one-click deploy to Cloudflare, Demo Cut for product videos, AI SEO, and analytics. Integrate Gemini, Claude, or GPT APIs into your app and ship in hours. Start free.