FLOPs: 2.2e+23
Notes: "It took us 65 days to train the model on a pool of 800 A100 graphics cards and 1.7 TB of online texts, books, and countless other sources."
Training Code Accessibility: Apache 2.0 for the weights; training details are described, but no training code is released: https://medium.com/yandex/yandex-publishes-yalm-100b-its-the-largest-gpt-like-neural-network-in-open-source-d1df53d0e9a6
Hardware: NVIDIA A100
Hardware Quantity: 800
Size Notes: 1.7 TB of data, ~300B tokens (from the GitHub repo: https://github.com/yandex/YaLM-100B). I've assumed that 1 token corresponds to 1 word in Russian.
Parameters: 100,000,000,000
Notes: 100B
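
As a rough cross-check of the FLOPs entry (a sketch under assumptions, not figures from Yandex): the common 6·N·D approximation with 100B parameters and 300B tokens lands in the same order of magnitude as the recorded 2.2e+23, and the reported 800 A100s over 65 days imply a modest hardware utilization. The A100 peak throughput value and the 6·N·D rule of thumb below are assumptions introduced here.

```python
# Hedged sketch: sanity-checking the recorded FLOPs figure two ways.
# Peak per-GPU throughput is an assumption (A100 dense BF16 peak),
# not a number from the YaLM-100B release.

LISTED_FLOPS = 2.2e23   # value recorded above
PARAMS = 100e9          # 100B parameters (from the record)
TOKENS = 300e9          # ~300B training tokens (from the record)

# (1) Parameter/token approximation: training compute ~= 6 * N * D
flops_6nd = 6 * PARAMS * TOKENS
print(f"6*N*D estimate: {flops_6nd:.1e} FLOPs")  # ~1.8e+23, same order as 2.2e+23

# (2) Hardware/time view: utilization implied by 800 A100s over 65 days
GPUS = 800
SECONDS = 65 * 86400
PEAK_BF16_FLOPS = 312e12  # assumption: A100 dense BF16 peak, 312 TFLOP/s per GPU
implied_util = LISTED_FLOPS / (GPUS * SECONDS * PEAK_BF16_FLOPS)
print(f"Implied hardware utilization: {implied_util:.0%}")  # roughly 16%
```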