> We introduce Falcon3, a family of decoder-only large language models under 10 billion parameters, developed by Technology Innovation Institute (TII) in Abu Dhabi. By pushing the boundaries of performance and training efficiency, this release reflects our ongoing commitment to advancing open and accessible large foundation models. Falcon3 represents a natural evolution from previous releases, emphasizing expanding the models' science, math, and code capabilities.
Notes: Training compute estimated with the standard 6ND approximation: 6 FLOP / parameter / token × 7 × 10^9 parameters × 14 × 10^12 tokens = 5.88e+23 FLOP
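A minimal sketch of the calculation above, using the 6ND (6 FLOP per parameter per token) approximation with the parameter and token counts from these notes; the figure is an estimate, not an officially reported number.

```python
# Back-of-envelope training compute estimate via the standard 6*N*D
# approximation (6 FLOP per parameter per token, forward + backward pass).
# Values are taken from the notes above.

FLOP_PER_PARAM_PER_TOKEN = 6        # 6ND approximation constant
N_PARAMS = 7e9                      # 7 * 10^9 parameters
N_TOKENS = 14e12                    # 14 Teratokens of pretraining data

training_compute = FLOP_PER_PARAM_PER_TOKEN * N_PARAMS * N_TOKENS
print(f"Estimated training compute: {training_compute:.2e} FLOP")
# -> Estimated training compute: 5.88e+23 FLOP
```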
Size Notes: "Pretrained on 14 Teratokens of datasets comprising of web, code, STEM, high quality and multilingual data using 1024 H100 GPU chips"
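As a rough cross-check on the 1024-GPU figure, the sketch below converts the compute estimate into an approximate training duration. The H100 peak throughput (~989 TFLOP/s at dense BF16) and the 40% model FLOPs utilization are illustrative assumptions, not reported values.

```python
# Rough wall-clock sanity check for the quoted 1024 H100 training setup.
# Peak throughput and utilization below are assumed, not reported, values.

TRAINING_COMPUTE = 5.88e23          # FLOP, from the 6*N*D estimate above
NUM_GPUS = 1024                     # H100 chips, per the size notes
PEAK_FLOPS_PER_GPU = 989e12         # assumed H100 dense BF16 peak (FLOP/s)
ASSUMED_MFU = 0.40                  # assumed model FLOPs utilization

effective_throughput = NUM_GPUS * PEAK_FLOPS_PER_GPU * ASSUMED_MFU
seconds = TRAINING_COMPUTE / effective_throughput
print(f"Approximate training time: {seconds / 86400:.1f} days")
# -> about 17 days under these assumptions
```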