by DeepSeek · Released December 2024 · Cutoff May 2024
DeepSeek V3 is a Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated per token. It achieves top-tier performance on benchmarks such as MMLU and HumanEval, rivaling leading closed-source models. It is designed for efficient inference and supports a 128K-token context window.
Input cost
$0.27 per 1M tokens
Output cost
$1.10 per 1M tokens
Context window
128K tokens
Max output
8192 tokens
Modalities
Parameters
671B (37B activated per token)
License
DeepSeek Model License (open weights); code released under MIT
High-performance text generation, coding, and reasoning tasks requiring a large context window and cost efficiency.
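
To make the pricing and limits listed above concrete, here is a minimal Python sketch that estimates the cost of a single request. The rates and limits are copied from this listing; the function and constant names are hypothetical, and two points are assumptions rather than documented behavior: that "128K" means 128,000 tokens, and that input and output tokens share the same context window.

```python
# Figures copied from the listing above; all names here are illustrative.
INPUT_USD_PER_M = 0.27      # USD per 1M input tokens
OUTPUT_USD_PER_M = 1.10     # USD per 1M output tokens
CONTEXT_WINDOW = 128_000    # assumption: "128K" read as 128,000 tokens
MAX_OUTPUT = 8192           # maximum output tokens per request


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if output_tokens > MAX_OUTPUT:
        raise ValueError(f"output exceeds the {MAX_OUTPUT}-token limit")
    # Assumption: prompt and completion must fit in the window together.
    if input_tokens + output_tokens > CONTEXT_WINDOW:
        raise ValueError("request does not fit in the context window")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # A long-context request: 100,000 input tokens and 4,000 output tokens.
    print(f"${estimate_cost(100_000, 4_000):.4f}")
```

At these rates a 100,000-token prompt with a 4,000-token reply comes to roughly three cents, with the input side accounting for most of the total.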