> Finally, GLM-Z1-9B-0414 is a surprise. We applied all the aforementioned techniques to train a small (9B) model. GLM-Z1-9B-0414 exhibits excellent capabilities in mathematical reasoning and general tasks, and its overall performance ranks among the best of all open-source models of the same size. In resource-constrained scenarios especially, this model strikes an excellent balance between efficiency and effectiveness, offering a powerful option for users seeking lightweight deployment.
Notes: Assuming it was trained on the same 15T-token dataset as the 32B model: 6 FLOP/parameter/token × 9 × 10^9 parameters × 15 × 10^12 tokens = 8.1 × 10^23 FLOP (see the sketch after these notes). "Likely" confidence due to the uncertain dataset size.
Notes: 9B parameters.
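A minimal sketch of the training-compute estimate above, using the standard 6ND approximation (roughly 6 FLOP per parameter per token); the 15T-token dataset size is an assumption carried over from the 32B model, as the note states.

```python
# Training-compute estimate via the standard 6ND approximation:
# total FLOP ≈ 6 * N (parameters) * D (training tokens).

FLOP_PER_PARAM_PER_TOKEN = 6  # forward + backward pass heuristic
N_PARAMS = 9e9                # GLM-Z1-9B-0414: 9B parameters
D_TOKENS = 15e12              # assumption: same 15T-token dataset as the 32B model

total_flop = FLOP_PER_PARAM_PER_TOKEN * N_PARAMS * D_TOKENS
print(f"Estimated training compute: {total_flop:.1e} FLOP")  # -> 8.1e+23 FLOP
```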