Falcon 180B is a super-powerful language model with 180 billion parameters, trained on 3.5 trillion tokens. It is currently at the top of the Hugging Face Leaderboard for pre-trained Open Large Language Models and is available for both research and commercial use. The model performs exceptionally well across tasks such as reasoning, coding, proficiency, and knowledge tests, even beating competitors like Meta's LLaMA 2. Among closed-source models, it ranks just behind OpenAI's GPT-4, and performs on par with Google's PaLM 2 Large, which powers Bard, despite being half that model's size.
Notes: 43,500 petaflop-days per Table 1 of the paper: 43,500 * 1e15 FLOP/s * 24 * 3600 s = 3.76e24 FLOP. Cross-check via C = 6ND: 6 FLOP/token/parameter * 3.5 trillion tokens * 180 billion parameters = 3.78e24 FLOP.
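A minimal sanity check of the two compute estimates (a sketch; it assumes one petaflop-day means 1e15 FLOP/s sustained for 24 hours, and uses the parameter and token counts above):

# Cross-check the reported training compute against the C = 6ND heuristic.
petaflop_days = 43_500
reported_compute = petaflop_days * 1e15 * 24 * 3600   # ~3.76e24 FLOP (Table 1 figure)

params = 180e9   # parameters (N)
tokens = 3.5e12  # training tokens (D)
scaling_estimate = 6 * params * tokens                # C = 6ND ~= 3.78e24 FLOP

print(f"reported: {reported_compute:.2e} FLOP, 6ND: {scaling_estimate:.2e} FLOP")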
Size Notes: 3.5 trillion tokens * (~3 words per 4 tokens) ~= 2.625 trillion words
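The same token-to-word conversion as a quick check (the 3-words-per-4-tokens ratio is the rough heuristic used above, not a measured value):

tokens = 3.5e12
words = tokens * 3 / 4   # ~2.625e12, i.e. ~2.625 trillion words
print(f"{words:.3e} words")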
Notes: "Falcon 180B is a super-powerful language model with 180 billion parameters"