Description: Pleias-pico-350m-Preview is a transformer base model, pretrained entirely from scratch with an architecture similar to Llama/GPT-NeoX for easier deployment and inference. It includes the following features, which would apply to any responsibly trained variant: trained only on open data under a permissive license and in compliance with the European AI Act; by design, all Pleias models are unable to output copyrighted content; extensive multilingual support for the main European languages; a new tokenizer designed for enhanced document-processing tasks and better multilingual support; and an extremely low level of toxicity and problematic content. Pleias-pico-350m-Preview has demonstrated unusual multilingual generation abilities for its size range. Fully supported languages include English, French, Spanish, German, Italian, Dutch, Latin, and Portuguese. Given its size, Pleias-pico-350m-Preview can run on CPU without any loss from compression. We provide a first GGUF variant as part of our release.
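A minimal CPU inference sketch with Hugging Face transformers, assuming the checkpoint is published under a repository id like PleIAs/Pleias-pico-350m-Preview; the repo id and the prompt are illustrative assumptions, not part of the release notes.

```python
# CPU inference sketch for a ~350M-parameter base model (no quantization needed at this size).
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "PleIAs/Pleias-pico-350m-Preview"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)  # fits comfortably in CPU RAM

# Base model: plain text completion rather than chat-style prompting.
prompt = "Les principaux corpus multilingues européens comprennent"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```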
Notes: Token-based estimate: 6 FLOP/parameter/token * 350 * 10^6 parameters * 1,086,324,736,000 tokens = 2.2812819e+21 FLOP. Hardware-based estimate: 989.4 * 10^12 FLOP/GPU/sec [bf16 peak assumed] * 46 hours * 3600 sec/hour * 64 GPUs * 0.3 [assumed utilization] = 3.1458171e+21 FLOP. Final figure is the geometric mean of the two estimates: sqrt(2.2812819e+21 * 3.1458171e+21) = 2.6788982e+21 FLOP.
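A short script reproducing the two compute estimates and their geometric mean, using only the figures stated above (the bf16 peak and 0.3 utilization are the assumptions noted there):

```python
# Reproduce the training-compute estimates from the notes above.
from math import sqrt

# Token-based estimate: 6 FLOP per parameter per token.
params = 350e6
tokens = 1_086_324_736_000
flop_from_tokens = 6 * params * tokens  # ~2.2812819e21 FLOP

# Hardware-based estimate: per-GPU bf16 peak (assumed) x time x GPU count x assumed utilization.
flop_per_gpu_sec = 989.4e12
hours = 46
gpus = 64
utilization = 0.3
flop_from_hardware = flop_per_gpu_sec * hours * 3600 * gpus * utilization  # ~3.1458171e21 FLOP

# Final figure: geometric mean of the two estimates.
combined = sqrt(flop_from_tokens * flop_from_hardware)  # ~2.6788982e21 FLOP
print(f"{flop_from_tokens:.7e} {flop_from_hardware:.7e} {combined:.7e}")
```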
Size Notes: "Training schedule includes 518,000 steps (batch size 1,024) on a filtered and enhanced version of Common Corpus (1,086,324,736,000 tokens)." The stated token count is consistent with a 2,048-token sequence length: 518,000 steps * 1,024 sequences/step * 2,048 tokens/sequence = 1,086,324,736,000 tokens.
Notes: 350M