Ling-1T is the first flagship non-thinking model in the Ling 2.0 series, featuring 1 trillion total parameters with ≈50 billion active parameters per token. Built on the Ling 2.0 architecture, Ling-1T is designed to push the limits of efficient reasoning and scalable cognition. Pre-trained on 20 trillion+ high-quality, reasoning-dense tokens, Ling-1T-base supports up to 128K context length and adopts an evolutionary chain-of-thought (Evo-CoT) process across mid-training and post-training. This curriculum greatly enhances the model's efficiency and reasoning depth, allowing Ling-1T to achieve state-of-the-art performance on multiple complex reasoning benchmarks while balancing accuracy and efficiency.
Notes: 6 FLOP/parameter/token × 5e10 active parameters × 2e13 tokens = 6e+24 FLOP (see the sketch below)
Size Notes: "Pre-trained on 20 trillion+ high-quality, reasoning-dense tokens"
Notes: 1 trillion total parameters with 50 billion activated parameters
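The compute figure above follows the standard 6ND approximation (roughly 6 FLOP per parameter per token for a forward plus backward pass in dense transformer training, with N taken as active parameters for an MoE model). A minimal sketch, using the stated figures of 50 billion active parameters and 20 trillion tokens (a lower bound given the "20 trillion+" wording):

```python
# Sketch of the 6*N*D training-compute estimate for Ling-1T.
# N = active parameters per token (not total parameters, since
# only ~50B of the 1T MoE parameters are active per token).
# D = number of pre-training tokens (20T+ stated; 20T used here
# as a lower bound).

FLOP_PER_PARAM_PER_TOKEN = 6   # forward + backward pass approximation
ACTIVE_PARAMS = 50e9           # ~50 billion active parameters per token
TRAINING_TOKENS = 20e12        # 20 trillion+ pre-training tokens

training_flop = FLOP_PER_PARAM_PER_TOKEN * ACTIVE_PARAMS * TRAINING_TOKENS
print(f"Estimated training compute: {training_flop:.1e} FLOP")
# -> Estimated training compute: 6.0e+24 FLOP
```

Using active rather than total parameters is the conventional choice for MoE models, since inactive experts contribute no FLOPs to a given token; an estimate based on the full 1 trillion parameters would overstate compute by roughly 20×.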