Introducing the latest additions to our Stable LM 2 language model series: a 12 billion parameter base model and an instruction-tuned variant, trained on 2 trillion tokens in seven languages: English, Spanish, German, Italian, French, Portuguese, and Dutch. This medium-sized model balances strong performance, efficiency, memory requirements, and speed, following our established Stable LM 2 1.6B framework as detailed in our previously released technical report. With this release, we’re extending our model range, offering a transparent and powerful tool for developers to innovate in AI language technology. Soon, we plan to introduce a long-context variant of these models, which will be available on Hugging Face upon release.

From Hugging Face: Stable LM 2 12B is a 12.1 billion parameter decoder-only language model pre-trained on 2 trillion tokens of diverse multilingual and code datasets for two epochs.
Notes: Compute estimate (6ND rule): 6 × 12,143,605,760 params × 2T tokens × 2 epochs ≈ 2.91e23 FLOP (see the sketch below). Trained on 384 H100s (AWS P5 instances).
Size Notes: 2T tokens
Notes: The precise parameter count is given in the HF model card.
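As a quick check on the compute note above, here is a minimal Python sketch of the standard 6ND estimate, assuming the parameter count reported in the HF model card and two epochs over the 2T-token dataset:

```python
# Approximate training compute via C ≈ 6 * N * D,
# counting both epochs over the 2T-token dataset.

params = 12_143_605_760    # parameter count from the HF model card
tokens_per_epoch = 2e12    # 2 trillion tokens
epochs = 2

flops = 6 * params * tokens_per_epoch * epochs
print(f"Estimated training compute: {flops:.2e} FLOP")  # ≈ 2.91e+23
```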