Falcon Mamba is a new model by the Technology Innovation Institute (TII) in Abu Dhabi, released under the TII Falcon Mamba 7B License 1.0. The model is open access and available within the Hugging Face ecosystem for anyone to use for their research or application purposes.
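A minimal sketch of loading the model from the Hub, assuming the public repo id tiiuae/falcon-mamba-7b and a transformers version with Falcon Mamba support:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-mamba-7b"  # assumed repo id on the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Falcon Mamba is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```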
Notes: 9.894e14 FLOP / GPU / sec [bf16 assumed] * 1440 hours [see training time notes] * 3600 sec / hour * 256 GPUs * 0.3 [assumed utilization] = 3.9391101e+23 FLOP
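The estimate can be reproduced directly; a short sketch, assuming the 9.894e14 FLOP/s figure is the per-GPU bf16 dense peak of the accelerators used:

```python
# Compute estimate: peak throughput per GPU x training seconds x GPU count x utilization.
peak_flop_per_gpu_per_sec = 989.4e12  # bf16 dense peak [assumed]
hours = 1440                          # see training time notes
gpus = 256
utilization = 0.3                     # assumed utilization

total_flop = peak_flop_per_gpu_per_sec * hours * 3600 * gpus * utilization
print(f"{total_flop:.7e}")  # 3.9391101e+23 FLOP
```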
Size Notes: "Falcon Mamba was trained with ~ 5500GT of data, mainly composed of RefinedWeb data with addition of high-quality technical data and code data from public sources." Batch size: 2048. Sequence length: 8192 during the last training stages.
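As a rough sanity check on the headline estimate, the standard 6ND training-compute approximation (derived for transformers, so applied to this Mamba-style model only loosely) lands within the same order of magnitude, assuming ~7B parameters and 5500 GT ≈ 5.5e12 tokens:

```python
# 6*N*D cross-check: N = parameter count, D = training tokens.
params = 7e9     # ~7B parameters [assumed from the model name]
tokens = 5.5e12  # ~5500 GT, reading GT as gigatokens [assumed]

approx_flop = 6 * params * tokens
print(f"{approx_flop:.2e}")  # ~2.31e+23 FLOP, same order as 3.94e+23 above
```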