Today, we are excited to announce that with our latest Turing universal language representation model (T-ULRv5), a Microsoft-created model is once again the state of the art and at the top of the Google XTREME public leaderboard. Resulting from a collaboration between the Microsoft Turing team and Microsoft Research, the 2.2 billion-parameter T-ULRv5 XL outperforms the current 2nd best model by an average score of 1.7 points. It is also the state of the art across each of the four subcategories of tasks on the leaderboard. These results demonstrate the strong capabilities of T-ULRv5, which, in addition to being more capable, trains 100 times faster than its predecessors.
FLOPs: 2.9e+22
Notes: 312000000000000 FLOP / GPU / sec [A100] * 256 GPUs * 336 hours * 3600 sec / hour * 0.3 [assumed utilization] = 2.8983951e+22 FLOP
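The arithmetic in the note above can be reproduced with a short sketch. All input values are taken from the note; the constant and function names are illustrative, and the 0.3 utilization is the note's own assumption rather than a measured figure.

```python
# Hardware-based training compute estimate, using the figures from the note.
PEAK_FLOP_PER_GPU_PER_SEC = 312e12  # A100 peak throughput quoted in the note
NUM_GPUS = 256
TRAINING_HOURS = 336
SECONDS_PER_HOUR = 3600
ASSUMED_UTILIZATION = 0.3  # assumed in the note, not measured


def estimate_training_flop(peak_flop_per_sec, num_gpus, hours, utilization):
    """Return total FLOP as peak rate x GPU count x wall-clock seconds x utilization."""
    return peak_flop_per_sec * num_gpus * hours * SECONDS_PER_HOUR * utilization


total_flop = estimate_training_flop(
    PEAK_FLOP_PER_GPU_PER_SEC, NUM_GPUS, TRAINING_HOURS, ASSUMED_UTILIZATION
)
print(f"{total_flop:.2e}")  # prints 2.90e+22
```

The estimate scales linearly with the utilization factor, so a different assumption (for example 0.4 instead of 0.3) changes the result proportionally.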
Training Code Accessibility: "Microsoft Turing models are also available for custom application building through our private preview program" "If you are a researcher who would like to work with us in assessing and improving Turing models, Microsoft Turing Academic Program (MS-TAP) allows you to submit a proposal and get access to these models in greater detail."
Hardware: NVIDIA A100
Hardware Quantity: 256
Parameters: 2200000000
Notes: 2.2B