by Alibaba · Released August 2023 · Cutoff June 2023
Qwen-Audio is a large audio-language model developed by Alibaba Cloud that processes and understands speech, music, and environmental sounds. It extends the Qwen series with audio understanding, enabling tasks such as audio captioning, sound event detection, and speech recognition. As part of the open-source Qwen family, it offers a unified framework for audio and text interactions.
Input cost
Free (open source)
Output cost
Free (open source)
Context window
8192 tokens
Max output
2048 tokens
Modalities
Parameters
7B
License
Apache-2.0
Audio understanding and captioning tasks, including speech, music, and environmental sound analysis.
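The context window and max output figures listed above imply a budget for the prompt. The sketch below works through that arithmetic under one loudly stated assumption: that generated tokens count against the same 8192-token window as the input (the listing does not say this, and it does not say how many tokens an audio clip consumes). The function name `prompt_budget` is hypothetical and is not part of any Qwen API.

```python
# Figures taken from the listing above.
CONTEXT_WINDOW = 8192  # total tokens
MAX_OUTPUT = 2048      # maximum generated tokens


def prompt_budget(requested_output: int,
                  context_window: int = CONTEXT_WINDOW,
                  max_output: int = MAX_OUTPUT) -> int:
    """Return the tokens left for the prompt (text plus audio).

    ASSUMPTION: output tokens share the context window with the input.
    Requests above max_output are clamped to max_output.
    """
    if requested_output < 0:
        raise ValueError("requested_output must be non-negative")
    reserved = min(requested_output, max_output)
    return context_window - reserved


if __name__ == "__main__":
    print(prompt_budget(2048))  # 6144
    print(prompt_budget(512))   # 7680
```

If the assumption holds, reserving the full 2048 output tokens leaves 6144 tokens for the text prompt and encoded audio combined; shorter requested outputs leave proportionally more.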