by Alibaba · Released May 2025 · Cutoff April 2025
Qwen3 4B is a compact language model from Alibaba's Qwen3 series, designed for fast inference and deployment on resource-constrained devices. With 4 billion parameters, it keeps computational cost low while retaining strong performance for its size, making it suitable for edge computing and real-time applications.
Input cost
Free (open source)
Output cost
Free (open source)
Context window
32K tokens
Max output
8192 tokens
Modalities
Text
Parameters
4B
License
Apache-2.0
Best suited to lightweight, fast inference tasks on edge devices or in latency-sensitive applications.
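
The context window and max output figures listed above imply a limit on prompt length. The following is a minimal sketch of that arithmetic, not an official utility. It assumes "32K" means 32 × 1024 = 32,768 tokens and that prompt and generated tokens share the same window. The function names are hypothetical.

```python
# Figures from the spec listing; the exact meaning of "32K" is an assumption.
CONTEXT_WINDOW = 32 * 1024  # assumed: 32K = 32,768 tokens
MAX_OUTPUT = 8192           # max output tokens as listed


def max_prompt_tokens(reserved_output: int = MAX_OUTPUT) -> int:
    """Return how many prompt tokens fit once output space is reserved."""
    if not 0 <= reserved_output <= MAX_OUTPUT:
        raise ValueError(f"reserved_output must be between 0 and {MAX_OUTPUT}")
    # Assumption: prompt and output tokens draw from one shared window.
    return CONTEXT_WINDOW - reserved_output


def fits(prompt_tokens: int, reserved_output: int = MAX_OUTPUT) -> bool:
    """Check whether a prompt of the given length leaves room for the output."""
    return 0 <= prompt_tokens <= max_prompt_tokens(reserved_output)


if __name__ == "__main__":
    print(max_prompt_tokens())      # budget with the full output reserved
    print(max_prompt_tokens(1024))  # budget when only a short reply is needed
    print(fits(30000))              # too long if the full output is reserved
```

Reserving less than the full output allowance leaves more room for the prompt, which may matter for latency-sensitive uses where replies are short.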