by OpenAI · Released May 2024 · Cutoff October 2023
GPT-4o ('omni') is OpenAI's flagship multimodal model. It accepts text, image, and audio inputs and produces text, image, and audio outputs. It matches GPT-4 Turbo performance on English text and code while being significantly faster and 50% cheaper in API pricing. GPT-4o achieves state-of-the-art results on vision and multilingual benchmarks, and offers improved reasoning in non-English languages.
Input cost: $5.00 per 1M tokens
Output cost: $15.00 per 1M tokens
Context window: 128K tokens
Max output: 4,096 tokens
Modalities: text, image, and audio (input and output)
License: proprietary
Best suited to real-time multimodal applications that need fast, cost-effective reasoning across text, images, and audio.
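As a rough illustration of how the listed pricing and limits combine, the sketch below estimates the cost of a single request. The helper `estimate_cost` is hypothetical and not part of any OpenAI library. It assumes that "128K" means 128,000 tokens and that input and output tokens together must fit in the context window. Actual billing may differ.

```python
from decimal import Decimal

# Figures copied from the listing above (USD per 1M tokens, token limits).
INPUT_USD_PER_MTOK = Decimal("5.00")
OUTPUT_USD_PER_MTOK = Decimal("15.00")
CONTEXT_WINDOW = 128_000  # assumption: "128K" read as 128,000 tokens
MAX_OUTPUT = 4096


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if output_tokens > MAX_OUTPUT:
        raise ValueError("output exceeds the listed max output")
    # Assumption: prompt and completion share the context window.
    if input_tokens + output_tokens > CONTEXT_WINDOW:
        raise ValueError("request exceeds the listed context window")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / Decimal(1_000_000)


# 10,000 input tokens and 1,000 output tokens:
# (10,000 * 5.00 + 1,000 * 15.00) / 1,000,000 = 0.065 USD
print(estimate_cost(10_000, 1_000))
```

At these rates an output token costs three times as much as an input token, so requests with long completions are the main cost driver.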