by Google · Released May 2025
Gemini 2.5 Flash-Lite is a cost-efficient, low-latency model in the Gemini 2.5 family, designed for high-volume, simple tasks. It offers the lowest cost and fastest speed among Gemini 2.5 models while keeping a 1M-token context window, which suits use cases that need quick responses and minimal processing.
Input cost
$0.075 per 1M tokens
Output cost
$0.30 per 1M tokens
Context window
1M tokens
Max output
8192 tokens
Modalities
License
proprietary
Best suited to high-volume, simple tasks that need low latency and low cost, such as classification, extraction, and basic chat.
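To make the listed per-token rates concrete, here is a minimal sketch of a cost estimate for a batch of requests. The function name `estimate_cost_usd` and the workload figures are hypothetical, chosen only for illustration. The sketch assumes "1M" means exactly 1,000,000 tokens, checks input against the context window and output against the max output separately, and ignores any other billing rules a provider might apply.

```python
# Figures copied from the listing above (USD per 1M tokens, token limits).
INPUT_USD_PER_MILLION = 0.075
OUTPUT_USD_PER_MILLION = 0.30
CONTEXT_WINDOW_TOKENS = 1_000_000  # assumption: "1M" read as exactly 1,000,000
MAX_OUTPUT_TOKENS = 8192


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts.

    Hypothetical helper: it applies only the two listed rates and the
    two listed limits, nothing else.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > CONTEXT_WINDOW_TOKENS:
        raise ValueError("input exceeds the listed context window")
    if output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("output exceeds the listed max output")

    # Each rate is quoted per 1M tokens, so scale the counts accordingly.
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Illustrative workload: 10,000 classification calls,
    # each with 500 input tokens and 50 output tokens.
    per_request = estimate_cost_usd(input_tokens=500, output_tokens=50)
    print(f"10,000 requests: ${per_request * 10_000:.3f}")
```

Under these assumptions the example workload comes to about $0.525: 5M input tokens at $0.075 per 1M plus 0.5M output tokens at $0.30 per 1M.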