CPM-Bee is a fully open-source, commercially usable Chinese-English bilingual base model with ten billion parameters. It is the second milestone of the CPM-Live training process. Built on a Transformer auto-regressive architecture, CPM-Bee was pre-trained on an extensive corpus of trillion-scale tokens, giving it strong foundational capabilities.
Notes: 6 FLOP / parameter / token * 10e9 parameters * 1.002e11 tokens [see training dataset size notes] = 6.012e21 FLOP
Size Notes: CPM-Bee was planned to train on 600GB of clean data. 600GB * 167M tokens/GB = 1.002e11 tokens. https://github.com/OpenBMB/CPM-Live/blob/master/plans/CPM-Bee%E8%AE%AD%E7%BB%83%E8%AE%A1%E5%88%92%E4%B9%A6.md
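A minimal sketch of the arithmetic behind these two estimates, assuming the 167M tokens/GB heuristic from the size notes and the standard C ≈ 6·N·D approximation for dense Transformer training compute:

```python
# Back-of-the-envelope training compute for CPM-Bee (10B parameters).
# Assumptions: 167M tokens/GB heuristic, and 6 FLOP per parameter per
# token (forward + backward pass) for dense Transformer pre-training.

DATA_GB = 600                  # planned clean pre-training data, in GB
TOKENS_PER_GB = 167e6          # assumed tokens-per-gigabyte ratio
PARAMS = 10e9                  # 10 billion parameters
FLOP_PER_PARAM_PER_TOKEN = 6   # standard dense-training approximation

tokens = DATA_GB * TOKENS_PER_GB                    # dataset size in tokens
flop = FLOP_PER_PARAM_PER_TOKEN * PARAMS * tokens   # total training compute

print(f"tokens: {tokens:.4g}")  # -> tokens: 1.002e+11
print(f"FLOP:   {flop:.4g}")    # -> FLOP:   6.012e+21
```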
Notes: "CPM-Bee 10B large model training will launch on October 13, 2022, with monthly model releases."