by DeepSeek · Released January 2025 · Cutoff December 2024
DeepSeek R1 is a reasoning-focused large language model built for complex problem-solving, mathematics, and coding tasks. It uses a Mixture-of-Experts architecture with 671B total parameters, of which 37B are activated per token, and supports a 128K-token context window. The model is open-source under the MIT license, with per-token pricing listed below.
Input cost
$0.55 per 1M tokens (cache hit), $2.19 per 1M tokens (cache miss)
Output cost
$8.00 per 1M tokens (same rate for cache hit and cache miss)
Context window
128K tokens
Max output
—
Modalities
Parameters
671B (37B activated)
License
MIT
Complex reasoning tasks, advanced mathematics, and code generation requiring deep logical analysis.
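To make the pricing above concrete, here is a minimal sketch of a per-request cost estimate. It assumes the listed rates apply linearly per token and that input tokens split cleanly into cache-hit and cache-miss portions. The function name `estimate_cost` and its parameters are hypothetical, chosen for illustration only, and are not part of any DeepSeek API.

```python
from decimal import Decimal

# Rates in USD per 1M tokens, copied from the listing above.
INPUT_CACHE_HIT_RATE = Decimal("0.55")
INPUT_CACHE_MISS_RATE = Decimal("2.19")
OUTPUT_RATE = Decimal("8.00")
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(cached_input_tokens, uncached_input_tokens, output_tokens):
    """Hypothetical helper: estimate the USD cost of one request.

    Assumes simple linear per-token billing at the listed rates.
    """
    counts = (cached_input_tokens, uncached_input_tokens, output_tokens)
    if any(n < 0 for n in counts):
        raise ValueError("token counts must be non-negative")
    input_cost = (
        INPUT_CACHE_HIT_RATE * cached_input_tokens
        + INPUT_CACHE_MISS_RATE * uncached_input_tokens
    ) / TOKENS_PER_UNIT
    output_cost = OUTPUT_RATE * output_tokens / TOKENS_PER_UNIT
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: 20K cached input tokens, 80K uncached, 5K output tokens.
    print(estimate_cost(20_000, 80_000, 5_000))
```

`Decimal` is used instead of floats so that small per-token amounts add up without binary rounding error. Under these assumptions, output tokens usually dominate the bill for long reasoning traces, since the output rate is several times the cache-miss input rate.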