GPT-4o mini API

GPT-4o mini is the small, low-cost sibling of GPT-4o: roughly 16× cheaper on input tokens, very fast, and capable enough for most routine tasks. It is the economic default for production OpenAI apps.

Input
$0.15
/ 1M tokens
Output
$0.60
/ 1M tokens
Context
128K
tokens
Vision
Yes

Top use cases

  • High-volume chat and Q&A
  • Embeddings post-processing and reranking
  • Cheap classification and extraction
  • Lightweight agents

Use GPT-4o mini in 30 seconds

ModelServer is OpenAI-compatible. Point your existing OpenAI SDK at modelserver.dev/v1 and set the model name to gpt-4o-mini.

gpt-4o-mini.py
from openai import OpenAI

# Only the API key and base_url change; the rest of the SDK call is standard.
client = OpenAI(
    api_key="sk-modelserver-...",          # your ModelServer API key
    base_url="https://modelserver.dev/v1", # OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Hello, GPT-4o mini!"}
    ],
)

print(response.choices[0].message.content)

Frequently asked questions

How much does GPT-4o mini cost?
GPT-4o mini is priced at $0.15 per 1M input tokens and $0.60 per 1M output tokens via ModelServer. ModelServer adds a flat 5.5% platform fee on top; there is no per-token markup and no monthly minimum.
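The all-in cost is easy to estimate. A minimal sketch using the rates above (the fee math here is an assumption based on the description of a flat 5.5% platform fee, not official ModelServer billing logic):

```python
# Estimate the all-in ModelServer cost of a GPT-4o mini workload.
# Rates: $0.15 / 1M input tokens, $0.60 / 1M output tokens, plus a 5.5% fee.

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated total cost in USD, including the platform fee."""
    base = input_tokens / 1e6 * 0.15 + output_tokens / 1e6 * 0.60
    return base * 1.055  # flat 5.5% platform fee applied on top

# Example: 10M input tokens and 2M output tokens in a month
print(round(estimate_cost(10_000_000, 2_000_000), 2))  # → 2.85
```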
What is the GPT-4o mini context window?
GPT-4o mini supports a 128K token context window, enough for roughly 96,000 English words (at about 0.75 words per token) in a single prompt.
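A quick fit-check based on that rule of thumb (an illustrative heuristic only; use a real tokenizer such as tiktoken for exact counts):

```python
# Rough context-window fit check for GPT-4o mini (128K tokens).
# Heuristic: ~0.75 English words per token.

CONTEXT_TOKENS = 128_000

def fits_in_context(word_count: int, reserved_output_tokens: int = 4_096) -> bool:
    """Estimate whether a prompt of `word_count` words fits,
    leaving headroom for the model's reply."""
    estimated_tokens = int(word_count / 0.75)
    return estimated_tokens + reserved_output_tokens <= CONTEXT_TOKENS

print(fits_in_context(90_000))   # → True  (≈120K tokens + headroom)
print(fits_in_context(100_000))  # → False (≈133K tokens)
```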
Is GPT-4o mini OpenAI-compatible via ModelServer?
Yes. Point your OpenAI SDK base_url to https://modelserver.dev/v1 and set model="gpt-4o-mini". Existing OpenAI-SDK code works without modification.
Who is GPT-4o mini best for?
Teams running high-throughput OpenAI workloads on a budget: high-volume chat, classification and extraction pipelines, and lightweight agents where per-token cost dominates.
Does GPT-4o mini support vision input?
Yes. GPT-4o mini accepts image inputs alongside text. Pass images as base64 or URL in the OpenAI-compatible message format.
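A sketch of the message shape for inline images, using the standard OpenAI chat-completions vision format (the prompt and image bytes below are placeholders):

```python
import base64

# Build an OpenAI-compatible vision message for GPT-4o mini.
# Images go in the `content` list as `image_url` parts, either a public
# URL or a base64 data URL, alongside the text part.

def image_message(prompt: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Return a single user message containing text plus one inline image."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{b64}"}},
        ],
    }

msg = image_message("What is in this image?", b"\x89PNG...")  # placeholder bytes
# Then pass it as: client.chat.completions.create(model="gpt-4o-mini", messages=[msg])
```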
