Google

Gemini 2.5 Pro API

Gemini 2.5 Pro is Google DeepMind's flagship model, strongest at long-context tasks thanks to its 1 million token context window. It is multimodal (text, image, video, and audio), with strong reasoning and competitive pricing.

Input: $1.25 / 1M tokens
Output: $10.00 / 1M tokens
Context: 1M tokens
Vision: Yes

Top use cases

  • Long-document Q&A (entire codebases, books, transcripts)
  • Video and audio understanding
  • Multi-document research
  • Long agentic conversations without context loss
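For the long-document use case, a minimal sketch of the pattern (the document and question here are stand-ins): place the entire text in one user message and send it in a single call, exactly like the quickstart request below.

```python
# Stand-in for a long document; a real transcript or codebase of up to
# roughly 750,000 words still fits in the 1M-token window.
document = "\n".join(f"Speaker {i % 2}: line {i}" for i in range(1_000))

# One prompt carries the whole document plus the question.
prompt = (
    "Answer the question using only the transcript below.\n\n"
    f"{document}\n\n"
    "Q: Who spoke first?"
)

# Sent exactly like the quickstart request:
# client.chat.completions.create(
#     model="gemini-2.5-pro",
#     messages=[{"role": "user", "content": prompt}],
# )
```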

Use Gemini 2.5 Pro in 30 seconds

ModelServer is OpenAI-compatible. Point your existing OpenAI SDK at https://modelserver.dev/v1 and set the model name to gemini-2.5-pro.

gemini-2-5-pro.py
from openai import OpenAI

# Any OpenAI SDK client works; only the base_url and model name change.
client = OpenAI(
    api_key="sk-modelserver-...",  # your ModelServer API key
    base_url="https://modelserver.dev/v1",
)

response = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[
        {"role": "user", "content": "Hello, Gemini 2.5 Pro!"}
    ],
)

print(response.choices[0].message.content)

Frequently asked questions

How much does Gemini 2.5 Pro cost?
Gemini 2.5 Pro is priced at $1.25 per 1M input tokens and $10.00 per 1M output tokens via ModelServer. ModelServer adds a flat 5.5% platform fee on top — no markups on individual tokens, no monthly minimum.
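The pricing above can be sketched in a few lines; treating the 5.5% platform fee as a multiplier on the token total is an assumption based on the description of a flat fee with no per-token markups.

```python
def modelserver_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost for Gemini 2.5 Pro via ModelServer:
    $1.25/1M input tokens + $10.00/1M output tokens, plus a flat
    5.5% platform fee applied to the token total (assumed)."""
    base = input_tokens * (1.25 / 1_000_000) + output_tokens * (10.00 / 1_000_000)
    return base * 1.055

# A 100K-token prompt with a 2K-token reply:
print(modelserver_cost(100_000, 2_000))  # roughly $0.153
```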
What is the Gemini 2.5 Pro context window?
Gemini 2.5 Pro supports a 1M token context window. You can put roughly 750,000 words in a single prompt.
Is Gemini 2.5 Pro OpenAI-compatible via ModelServer?
Yes. Point your OpenAI SDK base_url to https://modelserver.dev/v1 and set model="gemini-2.5-pro". Existing OpenAI-SDK code works without modification.
Who is Gemini 2.5 Pro best for?
Workloads that need more than 200K tokens in a single call: whole-codebase Q&A, books and long transcripts, multi-document research, and long-running agent conversations.
Does Gemini 2.5 Pro support vision input?
Yes. Gemini 2.5 Pro accepts image inputs alongside text. Pass images as base64 or URL in the OpenAI-compatible message format.
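As a sketch, an image part in the OpenAI-compatible message format might look like the following (the URL and question are placeholders); pass `messages` to `client.chat.completions.create` exactly as in the quickstart above.

```python
# Hypothetical multimodal message: one text part and one image part.
# The image may be a public URL or a base64 data URI (data:image/png;base64,...).
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What does this chart show?"},
            {
                "type": "image_url",
                "image_url": {"url": "https://example.com/chart.png"},  # placeholder URL
            },
        ],
    }
]

# Then, reusing the client from the quickstart:
# response = client.chat.completions.create(model="gemini-2.5-pro", messages=messages)
```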