OpenAI: GPT Audio

The gpt-audio model is OpenAI's first generally available audio model. The new snapshot features an upgraded decoder that produces more natural-sounding voices and keeps voice output more consistent. Audio is priced...

128,000 Context

16,384 Max output

$2.50 / 1M Prompt price

2026-01-19 Created

01 / Snapshot

Pricing, context, modalities, and parameters.

Model Radar detail pages stay neutral and operator-readable: core metadata first, then workflow fit.

Provider	OpenAI
Input modalities	text, audio
Output modalities	text, audio
Prompt price	$2.50 / 1M tokens
Completion price	$10.00 / 1M tokens
Request price	N/A
Context length	128,000 tokens
Max completion tokens	16,384
Supported parameters	frequency_penalty, logit_bias, logprobs, max_tokens, presence_penalty, response_format, seed, stop, structured_outputs, temperature, tool_choice, tools, top_logprobs, top_p
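
The listed prices and limits can be combined into a rough per-request cost estimate. The sketch below is illustrative only: the function name `estimate_cost` is hypothetical, it assumes the listed per-token prices apply uniformly to every token (the truncated description above suggests audio may be priced differently), and it assumes prompt and completion tokens share the 128,000-token context window.

```python
from decimal import Decimal

# Figures copied from the snapshot table above.
PROMPT_PRICE_PER_M = Decimal("2.50")       # USD per 1M prompt tokens
COMPLETION_PRICE_PER_M = Decimal("10.00")  # USD per 1M completion tokens
CONTEXT_LENGTH = 128_000
MAX_COMPLETION_TOKENS = 16_384


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Return an estimated USD cost for one request (hypothetical helper).

    Assumption: every token is billed at the listed text rates; audio
    tokens may be billed differently and are not modelled here.
    """
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if completion_tokens > MAX_COMPLETION_TOKENS:
        raise ValueError("completion exceeds the listed max completion tokens")
    # Assumption: prompt and completion together must fit in the context window.
    if prompt_tokens + completion_tokens > CONTEXT_LENGTH:
        raise ValueError("request exceeds the listed context length")

    per_million = Decimal(1_000_000)
    return (
        prompt_tokens * PROMPT_PRICE_PER_M
        + completion_tokens * COMPLETION_PRICE_PER_M
    ) / per_million


if __name__ == "__main__":
    # Example: 10,000 prompt tokens and 2,000 completion tokens.
    print(estimate_cost(10_000, 2_000))
```

Using `Decimal` rather than floats avoids rounding drift when many small per-request estimates are summed.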

Best for openai/gpt-audio

Coding workflows, Cross-modal work, Voice and audio, new, tool-capable

02 / Related

Related models in nearby categories.

Related models are derived from overlapping use-case categories so the detail page stays navigable.

OpenRouter <$0.01 / 1M

Auto Router

"Your prompt will be processed by a meta-model and routed to one of dozens of models (see below), optimizing for the best possible output. To see which model was used,...

Cross-modal work, Image generation, Voice and audio, High-volume usage

~Google $0.50 / 1M

Google Gemini Flash Latest

This model always redirects to the latest model in the Google Gemini Flash family.

Cross-modal work, Voice and audio, Long context, High-volume usage

~Google $2.00 / 1M

Google Gemini Pro Latest

This model always redirects to the latest model in the Google Gemini Pro family.

Cross-modal work, Voice and audio, Long context, Coding workflows

Google $0.10 / 1M

Google: Gemini 2.0 Flash

Gemini Flash 2.0 offers a significantly faster time to first token (TTFT) compared to Gemini Flash 1.5, while maintaining quality on par with larger models like Gemini Pro 1.5. It...

Cross-modal work, Voice and audio, Long context, High-volume usage

Google $0.07 / 1M

Google: Gemini 2.0 Flash Lite

Gemini 2.0 Flash Lite offers a significantly faster time to first token (TTFT) compared to Gemini Flash 1.5, while maintaining quality on par with larger models like Gemini Pro 1.5,...

Cross-modal work, Voice and audio, Long context, High-volume usage

Google $0.30 / 1M

Google: Gemini 2.5 Flash

Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks. It includes built-in "thinking" capabilities, enabling it to provide responses with greater...

Deep analysis, Coding workflows, Cross-modal work, Voice and audio
