> In this technical report, we present Baichuan 2, a series of large-scale multilingual language models containing 7 billion and 13 billion parameters, trained from scratch, on 2.6 trillion tokens.
FLOPs: 1.09e+23
Notes: 7B parameters × 2.6T tokens × 6 ≈ 1.092e23 FLOP. The paper also mentions training on 1,024 NVIDIA A800 GPUs at 180 TFLOPS per GPU; see the sketch below.
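A minimal sketch of the estimate in the Notes field, using the standard C ≈ 6·N·D approximation for dense transformer pretraining. The parameter count, token count, GPU count, and per-GPU throughput come from this entry; treating 180 TFLOPS as the *achieved* (rather than peak) per-GPU throughput is an assumption, so the implied training time is only illustrative.

```python
# Training-compute estimate via the standard C ~= 6 * N * D
# approximation for dense transformer pretraining.

params = 7e9      # N: 7 billion parameters (from this entry)
tokens = 2.6e12   # D: 2.6 trillion training tokens (from this entry)

total_flop = 6 * params * tokens
print(f"Estimated training compute: {total_flop:.3e} FLOP")  # 1.092e+23

# Rough wall-clock cross-check against the reported cluster.
# Assumption: 180 TFLOPS is achieved (not peak) per-GPU throughput;
# the actual training duration is not stated in this entry.
gpus = 1024
flops_per_gpu = 180e12  # 180 TFLOPS per GPU
days = total_flop / (gpus * flops_per_gpu) / 86400
print(f"Implied training time: {days:.1f} days")  # ~6.9 days
```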
Training Code Accessibility: https://huggingface.co/baichuan-inc/Baichuan2-7B-Base (license details: https://github.com/baichuan-inc/Baichuan2?tab=readme-ov-file). Model weights are released under the Baichuan 2 Model Community License Agreement (Baichuan 2 模型社区许可协议), which places restrictions on commercial applications with many DAUs and on particular types of businesses; the code is released under Apache 2.0.
Hardware: NVIDIA A800 PCIe 40 GB
Hardware Quantity: 1,024
Parameters: 7,000,000,000