XVERSE-MoE-A4.2B (XVERSE Technology / Shenzhen Yuanxiang Technology)

Introduction
This model is commonly used behind the scenes in AI tools.
Parameters
4,200,000,000 (4.2 billion activated)
Notes: The model uses a Mixture-of-Experts (MoE) architecture. Its total parameter count is 25.8 billion, of which about 4.2 billion are activated for each token.
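The gap between the two numbers comes from how MoE layers route tokens: every expert's weights count toward the total, but each token is processed by only a few experts. Below is a minimal, generic top-k routing sketch in PyTorch that makes the distinction concrete; it is a toy illustration with made-up sizes, not XVERSE's published implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    """Generic sparse MoE layer: all experts exist, but only top_k run per token."""
    def __init__(self, d_model=64, d_ff=256, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Every expert's weights count toward the *total* parameter count.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )
        self.gate = nn.Linear(d_model, num_experts, bias=False)

    def forward(self, x):
        # x: (num_tokens, d_model)
        scores = F.softmax(self.gate(x), dim=-1)          # router probabilities
        top_w, top_idx = scores.topk(self.top_k, dim=-1)  # pick top-k experts per token
        top_w = top_w / top_w.sum(dim=-1, keepdim=True)   # renormalize kept weights
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            for slot in range(self.top_k):
                mask = top_idx[:, slot] == e
                if mask.any():                            # only the chosen experts do work
                    out[mask] += top_w[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

layer = ToyMoELayer()
_ = layer(torch.randn(5, 64))  # run a few dummy tokens through the layer
per_expert = sum(p.numel() for p in layer.experts[0].parameters())
total = sum(p.numel() for p in layer.parameters())
# Parameters actually engaged for any single token: the router plus top_k experts.
activated = sum(p.numel() for p in layer.gate.parameters()) + layer.top_k * per_expert
print(f"total parameters: {total}, activated per token: {activated}")

The same accounting, at a much larger scale, is why XVERSE-MoE-A4.2B can hold 25.8 billion parameters in total while engaging only about 4.2 billion of them for any single token.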