Activefastmultimodal Proprietary

Gemini 2.0 Flash Lite

by Google· Released February 2025· Cutoff January 2025

Gemini 2.0 Flash Lite is a cost-efficient, low-latency model optimized for high-volume, text-only tasks. It offers a 1M token context window and is designed to be the most affordable option in the Gemini 2.0 Flash family, ideal for scaling AI applications.

Official Site API Docs

Input cost

$0.075 per 1M tokens

Output cost

$0.30 per 1M tokens

Context window

1M tokens

Max output

8192 tokens

Modalities

text

License

proprietary

Capabilities

Text GenerationFunction CallingCode GenerationStreamingJSON ModeSystem Instructions

Best For

High-volume, text-only applications requiring low cost and low latency.

Strengths

Lowest cost in Gemini 2.0 Flash family
Fast inference speed
Large 1M token context window
Good for simple text tasks

Limitations

No multimodal support (text-only)
Lower quality than Gemini 2.0 Flash and Pro
Not suitable for complex reasoning or creative tasks
Limited to text input/output

Use Cases

Customer support chatbots

Content classification

Data extraction

Simple summarization

Text-based automation

High-throughput labeling

Cost-sensitive AI deployments

Improvements Over Previous Model

New model in Gemini 2.0 Flash family, not an update to a previous Lite model
Lower pricing than Gemini 2.0 Flash ($0.075 vs $0.10 input per 1M tokens)
Same 1M token context window as Gemini 2.0 Flash
Text-only modality reduces cost and latency compared to multimodal Flash

Back to all models