Notes: C = 6ND = 6 * 40B parameters * 1000B tokens = 2.4e+23 FLOP (assuming one epoch). Table 1 of the Falcon paper (https://arxiv.org/pdf/2311.16867) gives 2,800 petaflop-days: 2,800 * 1e15 FLOP/s * 24 * 3600 s = 2.4192e+23 FLOP.
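The two figures above can be reproduced with a short sketch. The function names are hypothetical and only illustrate the arithmetic; the inputs (40B parameters, 1000B tokens, 2,800 petaflop-days) come from the note itself, and the 6ND rule is an approximation, not an exact count.

```python
# Sketch of the compute arithmetic in the note above.
# Function names are hypothetical; the numbers come from the note.

SECONDS_PER_DAY = 24 * 3600


def approx_training_flop(n_params: float, n_tokens: float) -> float:
    """Estimate training compute with the C = 6ND rule of thumb."""
    return 6 * n_params * n_tokens


def petaflop_days_to_flop(pf_days: float) -> float:
    """Convert petaflop-days (1e15 FLOP/s sustained for one day) to FLOP."""
    return pf_days * 1e15 * SECONDS_PER_DAY


estimate = approx_training_flop(40e9, 1000e9)  # 40B parameters, 1000B tokens
reported = petaflop_days_to_flop(2800)         # value attributed to Table 1
relative_gap = abs(reported - estimate) / reported

print(f"6ND estimate:  {estimate:.4e} FLOP")
print(f"Paper figure:  {reported:.4e} FLOP")
print(f"Relative gap:  {relative_gap:.2%}")
```

The two values agree to within about 1%, which suggests the 6ND estimate and the reported figure describe the same training run.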
Size Notes: 1000B tokens ~= 750B words
Notes: Model comes in 7B and 40B variants.