by Google · Released October 2024 · Cutoff September 2024
Gemini 1.5 Flash-8B is a smaller, faster, and cheaper variant of the Gemini 1.5 Flash model, optimized for high-volume, low-latency tasks. It retains multimodal capabilities (text, image, audio, video) and a 1M-token context window, which suits applications that need quick responses at low computational cost.
Input cost
$0.0375 per 1M tokens
Output cost
$0.15 per 1M tokens
Context window
1M tokens
Max output
8192 tokens
Modalities
text, image, audio, video
Parameters
8B
License
proprietary
High-volume, latency-sensitive applications requiring multimodal understanding at low cost.
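The listed per-token prices and limits can be turned into a rough per-request cost estimate. The sketch below is illustrative only: the function name `estimate_cost` and the constant names are hypothetical, the prices and limits are copied from the listing above, and it assumes the 1M-token context window bounds the input while the 8192-token cap applies to the output. Actual billing may differ (for example, by tier or prompt length).

```python
# Prices and limits copied from the listing above (USD per 1M tokens).
INPUT_USD_PER_M = 0.0375
OUTPUT_USD_PER_M = 0.15
CONTEXT_WINDOW = 1_000_000   # assumption: treated here as the input limit
MAX_OUTPUT = 8192            # listed max output tokens


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > CONTEXT_WINDOW:
        raise ValueError("input exceeds the listed context window")
    if output_tokens > MAX_OUTPUT:
        raise ValueError("output exceeds the listed max output")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # A 10,000-token prompt with a 500-token reply.
    print(f"${estimate_cost(10_000, 500):.5f}")
```

Under these assumptions, a 10,000-token prompt with a 500-token reply costs roughly $0.00045, so a million such requests would come to about $450.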