Description: Gemini 2.0 Flash-Lite is a member of the Gemini 2.0 series of models, a suite of highly capable, natively multimodal models designed to power a new era of agentic systems. Gemini 2.0 Flash-Lite is Google's most cost-efficient model, balancing efficiency and quality for low-cost workflows.

Inputs: Text strings (e.g., a question, a prompt, or documents to be summarized), images, audio, and video files, with a 1,048,576-token context window.

Outputs: Text, with an 8,192-token output limit.

Architecture: The Gemini 2.0 series builds upon the sparse Mixture-of-Experts (MoE) Transformer architecture (Clark et al., 2020; Fedus et al., 2021; Lepikhin et al., 2020; Riquelme et al., 2021; Shazeer et al., 2017; Zoph et al., 2022) used in Gemini 1.5. Key enhancements in Gemini 2.0 include refined architectural design and novel optimization methods, leading to substantial improvements in training stability and computational efficiency. Each model within the 2.0 family, including Gemini 2.0 Flash-Lite, is carefully designed and calibrated to achieve an optimal balance between quality and performance for its specific downstream applications.
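The key property of a sparse MoE Transformer is that each token is routed to only a small subset of expert sub-networks, decoupling total parameter count from per-token compute. The sketch below illustrates generic top-k expert routing only; it is not Gemini's actual implementation, and all names, dimensions, and weights are invented for the example.

```python
import numpy as np

def moe_layer(x, gate_w, expert_ws, k=2):
    """Minimal sparse MoE feed-forward layer: route each token to its
    top-k experts and combine their outputs, weighted by the router.

    x:         (num_tokens, d_model) token activations
    gate_w:    (d_model, num_experts) router weights
    expert_ws: list of (d_model, d_model) expert weight matrices
    """
    logits = x @ gate_w                               # (tokens, experts)
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)        # softmax over experts
    topk = np.argsort(-probs, axis=-1)[:, :k]         # top-k expert indices per token

    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        weights = probs[t, topk[t]]
        weights /= weights.sum()                      # renormalize over the chosen experts
        for w, e in zip(weights, topk[t]):
            # Only the selected experts run for this token -- this sparsity is
            # what keeps per-token compute low despite a large parameter count.
            out[t] += w * np.maximum(x[t] @ expert_ws[e], 0.0)
    return out

# Toy usage: 4 tokens, model dim 8, 4 experts, top-2 routing.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
gate_w = rng.normal(size=(8, 4))
expert_ws = [rng.normal(size=(8, 8)) for _ in range(4)]
print(moe_layer(x, gate_w, expert_ws).shape)  # (4, 8)
```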
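The Inputs and Outputs limits listed above map onto request parameters in the public generation API. The following is a minimal sketch, assuming the google-generativeai Python SDK and the gemini-2.0-flash-lite model identifier; the API key and prompt are placeholders, and image, audio, or video parts would be passed alongside the text.

```python
# pip install google-generativeai
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder credential

model = genai.GenerativeModel("gemini-2.0-flash-lite")

# Text-only request; multimodal parts can be included in the contents,
# subject to the 1,048,576-token context window.
response = model.generate_content(
    "Summarize the attached document in three bullet points.",
    generation_config={
        "max_output_tokens": 8192,  # matches the model's 8,192-token output limit
    },
)
print(response.text)
```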