Introducing the latest additions to our Stable LM 2 language model series: a 12 billion parameter base model and an instruction-tuned variant, trained on 2 trillion tokens in seven languages: English, Spanish, German, Italian, French, Portuguese, and Dutch. This medium-sized model balances strong performance, efficiency, memory requirements, and speed, following the Stable LM 2 1.6B framework detailed in our previously released technical report. With this release, we're extending our model range, offering a transparent and powerful tool for developers to innovate in AI language technology. Soon, we plan to introduce a long-context variant of these models, which will be available on Hugging Face upon release. From Hugging Face: Stable LM 2 12B is a 12.1 billion parameter decoder-only language model pre-trained on 2 trillion tokens of diverse multilingual and code datasets for two epochs.
FLOPs: 2.91e+23
Notes: 6 × 12,143,605,760 params × 2T tokens × 2 epochs ≈ 2.91e23 FLOPs (the standard 6ND approximation: 2 FLOPs per parameter for the forward pass and 4 for the backward pass, per token). Trained on 384 H100s (AWS P5 instances).
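The estimate above can be reproduced directly; a minimal sketch using the parameter and token counts from this entry:

```python
# Training-compute estimate via the 6ND approximation:
# FLOPs ≈ 6 × parameters × training tokens (here, tokens per epoch × epochs).
params = 12_143_605_760               # exact count from the HF model card
tokens_per_epoch = 2_000_000_000_000  # 2T tokens
epochs = 2

flops = 6 * params * tokens_per_epoch * epochs
print(f"{flops:.2e}")  # → 2.91e+23
```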
Training Code Accessibility: Requires Stability AI Membership. Free for non-commercial use; $20/month for commercial use if under $1M in annual revenue, under $1M in institutional funding, and under 1M monthly active users. The repository is under the Apache 2.0 license and includes detailed hyperparameters and training details: https://github.com/Stability-AI/StableLM/blob/main/LICENSE
Training Dataset: RefinedWeb, RedPajama-Data, The Pile, StarCoder, CulturaX
Dataset Size: 2000000000000
Hardware: NVIDIA H100 SXM5 80GB
Dataset Notes: The dataset comprises a filtered mixture of open-source large-scale datasets available on the Hugging Face Hub: a Falcon RefinedWeb extract (Penedo et al., 2023), RedPajama-Data (Together Computer, 2023) and The Pile (Gao et al., 2020), both without the Books3 subset, and StarCoder (Li et al., 2023). We further supplement our training with multilingual data from CulturaX (Nguyen et al., 2023), in particular from its OSCAR corpora, as well as restructured data in the style of Yuan & Liu (2022).
Size Notes: 2T tokens
Parameters: 12143605760
Notes: Precise number given in the HF model card