In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models. We provide a detailed description of our approach to fine-tuning and safety improvements of Llama 2-Chat in order to enable the community to build on our work and contribute to the responsible development of LLMs.
Notes: 13 billion parameters * 2 trillion tokens * 6 FLOP / token / parameter ≈ 1.6e23 FLOP
Size Notes: 2 trillion tokens ~= 1.5 trillion words
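A minimal Python sketch of the arithmetic in the two notes above, assuming the standard 6·N·D training-FLOP approximation and a rough 0.75 words-per-token ratio; both constants are illustrative assumptions, not figures reported in the paper.

```python
# Verify the compute and size estimates in the notes above.
PARAMS = 13e9                # 13 billion parameters (13B variant)
TOKENS = 2e12                # 2 trillion training tokens
FLOP_PER_PARAM_TOKEN = 6     # ~6 FLOP per parameter per token (forward + backward), an assumption
WORDS_PER_TOKEN = 0.75       # rough tokens-to-words conversion, an assumption

training_flop = PARAMS * TOKENS * FLOP_PER_PARAM_TOKEN
approx_words = TOKENS * WORDS_PER_TOKEN

print(f"Estimated training compute: {training_flop:.1e} FLOP")   # ~1.6e23
print(f"Approximate word count:     {approx_words:.1e} words")   # ~1.5e12
```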
Notes: Llama 2 has been released in 7B, 13B, and 70B variants.