Notes: Trained using NVIDIA NeMo (https://blogs.nvidia.com/blog/nemo-amazon-titan/) on 13,760 NVIDIA A100 GPUs across 1,720 P4d nodes (8 GPUs per node). Training took 48 days. Source: https://importai.substack.com/p/import-ai-365-wmd-benchmark-amazon
Counting operations: 6 × 200e9 parameters × 4e12 tokens = 4.8e24 FLOP.
GPU usage: 312e12 FLOP/s (A100 peak BF16 throughput) × 0.3 utilization × 13,760 GPUs × 1,152 hours × 3,600 s/hour ≈ 5.34e24 FLOP.
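As a check on the arithmetic, here is a minimal Python sketch reproducing both estimates. All input values come from the notes above; the 0.3 utilization factor is the assumption already stated there.

```python
# Sketch of the two training-compute estimates (inputs from the notes above).

# Estimate 1: counting operations, using the standard 6 * parameters * tokens rule.
params = 200e9            # 200B dense parameters
tokens = 4e12             # 4T training tokens
flop_counting = 6 * params * tokens
print(f"Counting operations: {flop_counting:.3e} FLOP")  # 4.800e+24

# Estimate 2: hardware usage (peak throughput * utilization * GPUs * seconds).
peak_flops = 312e12       # A100 peak BF16 throughput, FLOP/s
utilization = 0.3         # assumed utilization rate
num_gpus = 13_760         # 1,720 P4d nodes * 8 A100s each
seconds = 48 * 24 * 3600  # 48 days = 1,152 hours
flop_hardware = peak_flops * utilization * num_gpus * seconds
print(f"GPU usage: {flop_hardware:.4e} FLOP")            # 5.3413e+24
```

The two estimates agree to within about 10%, which is the usual sanity check between the parameter-count method and the hardware-time method.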
Size Notes: 4T tokens of training data, based on comments from Amazon engineer James Hamilton in a 2024 talk: https://perspectives.mvdirona.com/2024/01/cidr-2024/ Also cited here: https://lifearchitect.ai/titan/
Notes: 200B-parameter dense model. Source: https://importai.substack.com/p/import-ai-365-wmd-benchmark-amazon