Falcon 180B is a super-powerful language model with 180 billion parameters, trained on 3.5 trillion tokens. It's currently at the top of the Hugging Face Leaderboard for pre-trained Open Large Language Models and is available for both research and commercial use. This model performs exceptionally well in various tasks like reasoning, coding, proficiency, and knowledge tests, even beating competitors like Meta's LLaMA 2. Among closed-source models, it ranks just behind OpenAI's GPT-4 and performs on par with Google's PaLM 2 Large, which powers Bard, despite being half that model's size.
FLOPs: 3.76e+24
Notes: 43,500 petaflop-days per Table 1 of the paper: 43500 * 1e15 * 24 * 3600 = 3.76e24 FLOP. Cross-check: C = 6ND = 6 FLOP/token/parameter * 3.5 trillion tokens * 180 billion parameters = 3.78*10^24 FLOP.
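The two compute estimates in the note above can be reproduced with a minimal Python sketch. The function and variable names are illustrative assumptions, not taken from the paper; the inputs (43,500 petaflop-days, 180 billion parameters, 3.5 trillion tokens) are the figures quoted in this record.

```python
# Sketch: reproduce the two training-compute estimates from the note above.
# Function and variable names are hypothetical; the inputs come from this record.

SECONDS_PER_DAY = 24 * 3600


def flop_from_petaflop_days(petaflop_days: float) -> float:
    """Convert petaflop-days (1e15 FLOP/s sustained for one day) to total FLOP."""
    return petaflop_days * 1e15 * SECONDS_PER_DAY


def flop_from_6nd(parameters: float, tokens: float) -> float:
    """Approximate training compute as C = 6 * N * D."""
    return 6 * parameters * tokens


reported = flop_from_petaflop_days(43_500)   # figure attributed to Table 1 of the paper
approx = flop_from_6nd(180e9, 3.5e12)        # 180 billion parameters, 3.5 trillion tokens
relative_gap = abs(reported - approx) / reported

print(f"petaflop-days estimate: {reported:.2e} FLOP")
print(f"6ND estimate:           {approx:.2e} FLOP")
print(f"relative gap:           {relative_gap:.2%}")
```

The two estimates differ by well under 1%, which is why the record can report the petaflop-days figure with the 6ND value as a consistency check.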
Training Code Accessibility: "Falcon 180b can be commercially used but under very restrictive conditions, excluding any "hosting use"." https://huggingface.co/blog/falcon-180b
Hardware: NVIDIA A100 SXM4 40 GB
Hardware Quantity: 4096
Size Notes: 3.5 trillion tokens * (~3 words per 4 tokens) ~= 2.625 trillion words
Parameters: 180000000000
Notes: "Falcon 180B is a super-powerful language model with 180 billion parameters"