Model Description: GPT-Neo 2.7B is a transformer model designed using EleutherAI's replication of the GPT-3 architecture. GPT-Neo refers to the class of models, while 2.7B represents the number of parameters of this particular pre-trained model.
Training data: GPT-Neo 2.7B was trained on the Pile, a large scale curated dataset created by EleutherAI for the purpose of training this model.
Training procedure: This model was trained for 420 billion tokens over 400,000 steps. It was trained as a masked autoregressive language model, using cross-entropy loss.
Notes: source: https://www.aitracker.org/
6 FLOP/token/parameter * 2.7 * 10^9 parameters * 420 * 10^9 tokens [see dataset size notes] = 6.804 * 10^21 FLOP
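The arithmetic above follows the common 6ND heuristic (roughly 6 FLOP per parameter per training token). A minimal Python sketch of the same calculation, using only the parameter and token counts quoted in these notes:

    # Training-compute estimate via the 6*N*D approximation.
    # Parameter and token counts come from the notes above; the
    # factor of 6 is the standard heuristic, not a measured value.
    params = 2.7e9              # GPT-Neo 2.7B parameter count
    tokens = 420e9              # training tokens (see dataset size notes)
    flop_per_param_token = 6

    training_flop = flop_per_param_token * params * tokens
    print(f"Estimated training compute: {training_flop:.3e} FLOP")
    # -> Estimated training compute: 6.804e+21 FLOP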
Size Notes:
"In aggregate, the Pile consists of over 825GiB of raw text data" (see GPT-NeoX)
"This model was trained for 420 billion tokens over 400,000 steps. It was trained as a masked autoregressive language model, using cross-entropy loss." https://huggingface.co/EleutherAI/gpt-neo-2.7B
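As a rough consistency check of the quoted figures, the sketch below derives the implied tokens-per-step batch size, and, under an assumed bytes-per-token ratio (an assumption, not a reported figure), the approximate number of passes that 420 billion tokens represents over the 825 GiB Pile:

    # Rough consistency check of the quoted training figures.
    total_tokens = 420e9
    total_steps = 400_000
    tokens_per_step = total_tokens / total_steps
    print(f"Implied batch size: {tokens_per_step:,.0f} tokens/step")  # ~1,050,000

    pile_bytes = 825 * 2**30       # "over 825GiB of raw text data"
    bytes_per_token = 4            # assumed ratio for BPE-tokenized English text, not reported
    pile_tokens_est = pile_bytes / bytes_per_token
    print(f"Approx. passes over the Pile: {total_tokens / pile_tokens_est:.1f}")  # ~1.9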
Notes: source: https://www.eleuther.ai/projects/gpt-neo/
Note: Directory of LLMs (https://docs.google.com/spreadsheets/d/1gc6yse74XCwBx028HV_cvdxwXkmXejVjkO-Mz2uwE0k/edit#gid=0) gives a somewhat lower estimate (2e9)