In this technical report, we present TeleChat, a collection of large language models (LLMs) with 3 billion, 7 billion, and 12 billion parameters. It includes pretrained language models as well as fine-tuned chat models that are aligned with human preferences. TeleChat is initially pretrained on an extensive corpus comprising trillions of tokens of diverse English and Chinese text. Subsequently, the model is fine-tuned to align with human preferences, following a detailed methodology that we describe. We evaluate the performance of TeleChat on various tasks, including language understanding, mathematics, reasoning, code generation, and knowledge-based question answering. Our findings indicate that TeleChat achieves performance comparable to other open-source models of similar size across a wide range of public benchmarks. To support future research and applications utilizing LLMs, we release the fine-tuned model checkpoints of TeleChat's 7B and 12B variants, along with code and a portion of our pretraining data, to the public community.
FLOPs: 4.2e+22
Notes: 80 nodes, each with 8 NVIDIA A100 SXM 40 GB GPUs. Estimate: 6 FLOP/token/parameter × 7×10^9 parameters × 1×10^12 tokens = 4.2e+22 FLOP.
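As a quick check of this figure, a minimal sketch of the calculation (assuming the common 6 × parameters × tokens approximation for dense transformer training compute) is:

```python
# Minimal sketch of the training-compute estimate above, assuming the
# standard 6 * parameters * tokens approximation for dense transformers.
FLOP_PER_TOKEN_PER_PARAM = 6      # forward + backward pass approximation
n_params = 7e9                    # 7B-parameter model
n_tokens = 1e12                   # ~1 trillion training tokens

total_flop = FLOP_PER_TOKEN_PER_PARAM * n_params * n_tokens
print(f"{total_flop:.1e} FLOP")   # prints 4.2e+22 FLOP
```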
Training Code Accessibility: Apache 2.0
Weights: https://huggingface.co/Tele-AI/telechat-7B
Inference code: https://github.com/Tele-AI/Telechat
"Community use of the TeleChat model must follow the 《TeleChat Model Community License Agreement》. The TeleChat model supports commercial use. If you plan to use the TeleChat model or its derivatives for commercial purposes, you need to contact the mailbox below (tele_ai@chinatelecom.cn) and submit the application materials required by the 《TeleChat Model Community License Agreement》. After the review is approved, you will be granted a non-exclusive, global, non-transferable, non-sublicensable, revocable commercial copyright license."
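For reference, a minimal sketch of loading the released 7B checkpoint with the Hugging Face transformers library might look as follows; the trust_remote_code flag and generation arguments are assumptions, since the repository ships custom model code, so consult the linked inference repo for the exact usage:

```python
# Hedged sketch: loading the released TeleChat-7B weights via transformers.
# Exact arguments may differ -- see the official inference code linked above.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Tele-AI/telechat-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Hello, TeleChat!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```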
Hardware: NVIDIA A100 SXM4 40 GB
Hardware Quantity: 640
Size Notes: Table 3
Parameters: 7,000,000,000