StableLM-Base-Alpha-7B-v2 is a 7 billion parameter decoder-only language model pre-trained on diverse English datasets. This model is the successor to the first StableLM-Base-Alpha-7B model, addressing previous shortcomings through the use of improved data sources and mixture ratios.
Notes: "StableLM-Base-Alpha-7B-v2 is pre-trained using a multi-stage context length extension schedule following similar work (Nijkamp et al. 2023); first pre-training at a context length of 2048 for 1 trillion tokens, then fine-tuning at a context length of 4096 for another 100B tokens" 6890209280 params * 1.1 trillion tokens * 6 = 4.5e22 alternatively: "StableLM-Base-Alpha-7B-v2 was trained on the Stability AI cluster - occupying 384 NVIDIA A100 40GB GPUs across AWS P4d instances. Training took approximately 16.33 days to complete across both stages." 312 teraflops * 384 * 16.33 * 24 * 3600 * 0.3 = 5.07e22
Size Notes: 1 trillion tokens