by OpenAI · Released September 2022
Whisper is a general-purpose speech recognition model by OpenAI, capable of transcribing speech in multiple languages and translating it into English. It is trained on a large dataset of diverse audio and is robust to accents, background noise, and technical jargon. Whisper is available as an open-source model and also via OpenAI's API.
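As a rough illustration of the open-source route, the sketch below wraps the `openai-whisper` Python package in a small helper. It assumes the package and `ffmpeg` are installed locally; the helper name `transcribe_file` and its parameters are hypothetical, chosen only for this example, and the default model name simply mirrors the large-v3 checkpoint listed on this page.

```python
def transcribe_file(audio_path: str,
                    model_name: str = "large-v3",
                    translate: bool = False) -> str:
    """Transcribe an audio file, or translate its speech into English.

    Hypothetical helper for illustration; not part of the whisper package.
    """
    # Imported lazily so that merely defining this helper does not require
    # the package (assumed installed via `pip install openai-whisper`).
    import whisper

    # Downloads the checkpoint on first use, then loads it into memory.
    model = whisper.load_model(model_name)

    # "transcribe" keeps the spoken language; "translate" outputs English.
    task = "translate" if translate else "transcribe"
    result = model.transcribe(audio_path, task=task)

    # The result is a dict; the full transcript is stored under "text".
    return result["text"].strip()
```

Calling `transcribe_file("meeting.mp3")` would return the transcript in the spoken language, while `translate=True` would return English text; the file name here is a placeholder. Smaller checkpoints such as `"base"` may be more practical on machines without a GPU.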
Input cost
$0.006 per minute (audio input)
Output cost
—
Context window
—
Max output
—
Modalities
Audio input, text output
Parameters
1.5B (large-v3)
License
MIT
Accurate speech transcription and translation across many languages, especially in noisy environments.
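The per-minute input price listed above turns into a job estimate with simple arithmetic. The sketch below assumes cost scales linearly with audio duration and is pro-rated by the second; actual billing granularity and rounding may differ, and the function name `estimate_transcription_cost` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Listed API price for audio input, in US dollars per minute.
PRICE_PER_MINUTE_USD = Decimal("0.006")


def estimate_transcription_cost(duration_seconds: float) -> Decimal:
    """Estimate the API cost in USD for audio of the given duration.

    Assumption: cost is pro-rated linearly by the second.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")

    # Convert via str() to avoid binary floating-point artifacts.
    minutes = Decimal(str(duration_seconds)) / Decimal(60)

    # Round to a hundredth of a cent for display purposes.
    return (minutes * PRICE_PER_MINUTE_USD).quantize(
        Decimal("0.0001"), rounding=ROUND_HALF_UP
    )


# One hour of audio: 60 minutes x $0.006 per minute.
print(estimate_transcription_cost(3600))  # prints 0.3600
```

Under these assumptions, an hour of audio costs about $0.36, so 100 hours would come to roughly $36.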