Now we’re officially releasing Gemma 2 to researchers and developers globally. Available in both 9 billion (9B) and 27 billion (27B) parameter sizes, Gemma 2 is higher-performing and more efficient at inference than the first generation, with significant safety advancements built in. In fact, at 27B, it offers competitive alternatives to models more than twice its size, delivering the kind of performance that was only possible with proprietary models as recently as December. And that’s now achievable on a single NVIDIA H100 Tensor Core GPU or TPU host, significantly reducing deployment costs.
FLOPs: 2.11e+24
Notes: "For the 27B model, we train on an 8x24x32 configuration of TPUv5p, totaling 6144 chips." Trained on 13T tokens. Compute estimate: 6ND = 6 × 27e9 × 13e12 = 2.106e+24 FLOP.
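The note above applies the standard dense-transformer approximation C ≈ 6ND (6 FLOP per parameter per training token). A minimal sketch verifying that arithmetic, along with the chip count implied by the stated 8x24x32 TPU topology:

```python
# Check of the training-compute estimate using the C ≈ 6ND approximation.
# This is the common rule of thumb for dense transformers, not an exact count.

N = 27_000_000_000        # parameters (27B, from the notes)
D = 13_000_000_000_000    # training tokens (13T, from the notes)

flops = 6 * N * D
print(f"{flops:.3e}")     # → 2.106e+24

# TPU v5p pod topology from the notes: 8 x 24 x 32
chips = 8 * 24 * 32
print(chips)              # → 6144
```

The published figure of 2.11e+24 is this same estimate rounded to three significant figures.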
Training Code Accessibility: Gemma 2 is available under our commercially-friendly Gemma license, giving developers and researchers the ability to share and commercialize their innovations.
Hardware: Google TPU v5p
Hardware Quantity: 6144
Size Notes: "We train Gemma 2 27B on 13 trillion tokens of primarily-English data"
Parameters: 27,000,000,000