OpenAI
GPT-4o mini API
GPT-4o mini is the small, fast sibling of GPT-4o: roughly 16× cheaper on input tokens and still good enough for most routine tasks. It is the economical default for production OpenAI apps.
- Input: $0.15 / 1M tokens
- Output: $0.60 / 1M tokens
- Context: 128K tokens
- Vision: Yes
Top use cases
- High-volume chat and Q&A
- Embeddings post-processing and reranking
- Cheap classification and extraction
- Lightweight agents
Use GPT-4o mini in 30 seconds
ModelServer is OpenAI-compatible. Point your existing OpenAI SDK at modelserver.dev/v1 and set the model name to gpt-4o-mini.
gpt-4o-mini.py
```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-modelserver-...",
    base_url="https://modelserver.dev/v1",
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Hello, GPT-4o mini!"}
    ],
)
print(response.choices[0].message.content)
```

Frequently asked questions
- How much does GPT-4o mini cost?
- GPT-4o mini is priced at $0.15 per 1M input tokens and $0.60 per 1M output tokens via ModelServer. ModelServer adds a flat 5.5% platform fee on top — no markups on individual tokens, no monthly minimum.
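As a quick sanity check, the effective price of a request can be estimated from these numbers. The helper below is an illustrative sketch based on the pricing described above; the function name and the way the flat 5.5% fee is applied to the token total are assumptions, not an official billing formula.

```python
# Illustrative cost estimate for GPT-4o mini via ModelServer:
# $0.15 per 1M input tokens, $0.60 per 1M output tokens,
# plus an assumed flat 5.5% platform fee on the token total.

INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 0.60
PLATFORM_FEE = 0.055

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for one request, fee included."""
    base = (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M
    return base * (1 + PLATFORM_FEE)

# Example: 1M input tokens and 200K output tokens in a billing period.
print(f"${estimate_cost(1_000_000, 200_000):.4f}")
```

At this scale the fee stays small in absolute terms, which is why GPT-4o mini suits high-volume workloads.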
- What is the GPT-4o mini context window?
- GPT-4o mini supports a 128K token context window. You can put roughly 96,000 words in a single prompt.
- Is GPT-4o mini OpenAI-compatible via ModelServer?
- Yes. Point your OpenAI SDK base_url to https://modelserver.dev/v1 and set model="gpt-4o-mini". Existing OpenAI-SDK code works without modification.
- Who is GPT-4o mini best for?
- Teams running high-throughput OpenAI workloads on a budget: chat, classification, extraction, and lightweight agents where per-token cost matters more than peak capability.
- Does GPT-4o mini support vision input?
- Yes. GPT-4o mini accepts image inputs alongside text. Pass images as base64 or URL in the OpenAI-compatible message format.
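For vision, images ride along in the standard OpenAI multimodal message format, either as a public URL or as a base64 data URL. The helper below is a minimal sketch of building such a message; the function name and parameters are illustrative, while the message structure follows the OpenAI chat format.

```python
import base64

def image_message(text, image_url=None, image_bytes=None, mime="image/png"):
    """Build an OpenAI-style user message pairing text with one image,
    passed either as a URL or as raw bytes (base64-encoded into a data URL)."""
    if image_bytes is not None:
        encoded = base64.b64encode(image_bytes).decode("ascii")
        url = f"data:{mime};base64,{encoded}"
    elif image_url is not None:
        url = image_url
    else:
        raise ValueError("provide image_url or image_bytes")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": url}},
        ],
    }

msg = image_message("What is in this image?",
                    image_url="https://example.com/cat.png")
```

The returned dict can be passed directly in the `messages` list of `client.chat.completions.create`, alongside ordinary text messages, as in the quick-start above.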