Notes: The model uses a Mixture-of-Experts (MoE) architecture. It has 25.8 billion total parameters, of which only 4.2 billion are activated.
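As a rough illustration of why the activated parameter count (4.2B) is far smaller than the total (25.8B), the sketch below shows top-k expert routing: each token is sent through only k of the available expert feed-forward blocks, so only a fraction of the expert weights participate in any one forward pass. The expert count, top-k value, and layer sizes here are illustrative assumptions, not values taken from the model itself.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, d_ff = 64, 256   # hypothetical layer sizes, not from the source
n_experts, top_k = 8, 2   # hypothetical routing config, not from the source

# Each expert is a small feed-forward block: W_in (d_model x d_ff), W_out (d_ff x d_model).
experts = [
    (rng.standard_normal((d_model, d_ff)) * 0.02,
     rng.standard_normal((d_ff, d_model)) * 0.02)
    for _ in range(n_experts)
]
router = rng.standard_normal((d_model, n_experts)) * 0.02  # gating weights

def moe_forward(x):
    """Route a single token vector x through only the top-k experts."""
    logits = x @ router                    # one gating score per expert
    top = np.argsort(logits)[-top_k:]      # indices of the k highest-scoring experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()               # softmax over the selected experts only
    out = np.zeros_like(x)
    for w, i in zip(weights, top):
        w_in, w_out = experts[i]
        out += w * (np.maximum(x @ w_in, 0.0) @ w_out)  # ReLU feed-forward expert
    return out

x = rng.standard_normal(d_model)
y = moe_forward(x)

# Only top_k of n_experts contribute per token, so activated << total.
per_expert = d_model * d_ff + d_ff * d_model
print(f"total expert params: {n_experts * per_expert}, activated per token: {top_k * per_expert}")
```

In the real model the same principle applies at much larger scale: the 25.8B figure counts every expert in every MoE layer, while the 4.2B figure counts only the experts the router actually selects for a given token.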