Introducing a new generation of Yandex generative text models, which produce better answers. On a stream of requests that combines user questions with complex tasks in demand among businesses, YandexGPT 5 Pro outperforms the comparable previous-generation model in 67% of cases. On some types of tasks, such as writing and summarizing texts, the new model is not inferior to OpenAI's GPT-4o and other world-leading models. The fifth generation has two models, each with a context length of 32 thousand tokens: the more powerful Pro and the lightweight Lite.
Notes: 6 FLOP / token / parameter * 8 * 10^9 parameters * 15.32 * 10^12 tokens [see dataset size notes] = 7.3536e+23 FLOP
Size Notes: "At the first stage, the model was trained mainly on Russian-language and English-language texts with a total volume of 15T tokens with a context length of up to 8k tokens." "In the second stage, which we called Powerup, the model was trained on high-quality data of 320B tokens." Total: 15T + 320B = 15.32T tokens.
Notes: 8B parameters
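The compute figure in the notes above can be reproduced with a few lines of Python. This is a minimal sketch, assuming the common approximation of 6 FLOP per token per parameter used in the notes and treating the two quoted training stages (15T and 320B tokens) as the full training set. The variable names are illustrative and not taken from the source.

```python
# Approximate training compute: 6 FLOP per token per parameter (rule of thumb
# used in the notes; an approximation, not a measured value).
FLOP_PER_TOKEN_PER_PARAM = 6

params = 8 * 10**9            # 8B parameters
stage1_tokens = 15 * 10**12   # first stage: 15T tokens
powerup_tokens = 320 * 10**9  # second stage ("Powerup"): 320B tokens

total_tokens = stage1_tokens + powerup_tokens
training_flop = FLOP_PER_TOKEN_PER_PARAM * params * total_tokens

print(f"total tokens: {total_tokens:.4e}")
print(f"training compute: {training_flop:.4e} FLOP")
```

Integer arithmetic is used so the intermediate values stay exact. The result is only as precise as the rounded token and parameter counts quoted in the notes.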