Activefastllm Proprietary

Gemini 2.5 Flash-Lite

by Google· Released May 2025

Gemini 2.5 Flash-Lite is a cost-efficient, low-latency model in the Gemini 2.5 family, designed for high-volume, simple tasks. It offers the lowest cost and fastest speed among Gemini 2.5 models while maintaining a 1M token context window. It is ideal for use cases that require quick responses and minimal processing.

Official Site API Docs

Input cost

$0.075 per 1M tokens

Output cost

$0.30 per 1M tokens

Context window

1M tokens

Max output

8192 tokens

Modalities

text

License

proprietary

Capabilities

Function CallingStreamingJSON ModeCode GenerationText GenerationMultilingual Support

Best For

High-volume, simple tasks requiring low latency and low cost, such as classification, extraction, and basic chat.

Strengths

Lowest cost among Gemini 2.5 models
Fastest inference speed
1M token context window
Good for simple, repetitive tasks

Limitations

No multimodal capabilities (text-only)
Lower quality on complex reasoning compared to Flash and Pro
No vision or audio support
Not suitable for creative or nuanced tasks

Use Cases

Text classification

Data extraction

Simple chatbots

Content summarization

Sentiment analysis

Keyword extraction

Basic code generation

Improvements Over Previous Model

New model in the Gemini 2.5 Flash-Lite family, no direct predecessor
Offers the lowest cost and fastest speed in the Gemini 2.5 lineup
1M token context window matches larger models
Designed for high-volume, simple tasks

Back to all models