Model Details

Domain:

Task: Mathematical reasoning; Retrieval-augmented generation

Model Access: Open weights (restricted use)

AI Tools Usage

This model is commonly used behind the scenes in AI tools.

Introduction

Yandex is introducing a new generation of generative text models that give better answers. On a stream of requests that combines user questions with complex tasks in demand among businesses, YandexGPT 5 Pro outperforms the comparable previous-generation model in 67% of cases. On some types of tasks, such as writing and summarizing texts, the new model is not inferior to OpenAI's GPT-4o and other world-leading models. The fifth generation consists of two models, each with a context length of 32 thousand tokens: the more powerful Pro and the lightweight Lite.

Benchmarking

FLOPs: 7.35e+23

Notes: 6 FLOP / token / parameter * 8 * 10^9 parameters * 15.32 * 10^12 tokens [see dataset size notes] = 7.3536e+23 FLOP
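The arithmetic in the note can be reproduced with a short sketch. It assumes the 15.32 * 10^12 token figure is the sum of the two training stages quoted in the dataset size notes (15T tokens in the first stage plus 320B in the Powerup stage), and it applies the 6 FLOP per token per parameter factor stated in the note. The variable and function names are illustrative and do not come from any Yandex code.

```python
# Rule-of-thumb training compute: FLOP ~= 6 * parameters * training tokens.
# All names below are hypothetical; the numbers come from the notes on this page.

FLOP_PER_TOKEN_PER_PARAM = 6       # factor used in the note above

parameters = 8 * 10**9             # 8B parameters
stage1_tokens = 15 * 10**12        # first stage: 15T tokens
powerup_tokens = 320 * 10**9       # second stage ("Powerup"): 320B tokens


def training_flop(n_params, n_tokens, factor=FLOP_PER_TOKEN_PER_PARAM):
    """Return the approximate training compute in FLOP."""
    return factor * n_params * n_tokens


# Assumption: the 15.32e12 total is simply stage 1 plus Powerup.
total_tokens = stage1_tokens + powerup_tokens
total_flop = training_flop(parameters, total_tokens)

print(f"total tokens: {total_tokens:.4e}")
print(f"training compute: {total_flop:.4e} FLOP")
```

Integer arithmetic keeps the result exact. Formatting to four decimal places gives the 7.3536e+23 value in the note, which the page rounds to 7.35e+23.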

Training

Training Code Accessibility: Custom license; requires a contract agreement for users with more than 10 million output tokens per month. https://huggingface.co/yandex/YandexGPT-5-Lite-8B-pretrain

Size Notes: "At the first stage, the model was trained mainly on Russian-language and English-language texts with a total volume of 15T tokens with a context length of up to 8k tokens." "In the second stage, which we called Powerup, the model was trained on high-quality data of 320B tokens."

Parameters

Parameters: 8000000000

Notes: 8B

Related Models

YaLM, by Yandex

Language
