We’re launching Granite 4.0, the next generation of IBM language models. Granite 4.0 features a new hybrid Mamba/transformer architecture that greatly reduces memory requirements without sacrificing performance. The models can run on significantly cheaper GPUs and at significantly lower cost than conventional LLMs. These new Granite 4.0 offerings, open sourced under a standard Apache 2.0 license, are the world’s first open models to receive ISO 42001 certification and are cryptographically signed, confirming their adherence to internationally recognized best practices for security, governance and transparency. Granite 4.0 models are available on IBM watsonx.ai, as well as through platform partners including (in alphabetical order) Dell Technologies on Dell Pro AI Studio and Dell Enterprise Hub, Docker Hub, Hugging Face, Kaggle, LM Studio, NVIDIA NIM, Ollama, OPAQUE and Replicate. Access through Amazon SageMaker JumpStart and Microsoft Azure AI Foundry is coming soon.
Notes: 6 FLOP/parameter/token × 9B active parameters × 22.5T tokens = 1.215e+24 FLOP
Size Notes: "all Granite 4.0 models are trained on samples drawn from the same carefully compiled 22T-token corpus of enterprise-focused training data, as well as the same improved pre-training methodologies, post-training regimen and chat template." 4 training stages: 15T + 5T + 2T + 0.5T = 22.5T tokens
Notes: a hybrid mixture-of-experts (MoE) model with 32B total parameters (9B active)
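A minimal sketch of the compute estimate above, assuming the standard 6·N·D approximation with active (not total) parameters and the four-stage token breakdown from the size notes:

```python
# Training-compute estimate via the 6 * N_active * D approximation
# (FLOP ≈ 6 × active parameters × training tokens).

FLOP_PER_PARAM_PER_TOKEN = 6          # forward + backward pass approximation
ACTIVE_PARAMS = 9e9                   # 9B active parameters (32B total, MoE)

# Tokens per training stage, per the 22.5T breakdown in the size notes
stage_tokens = [15e12, 5e12, 2e12, 0.5e12]
total_tokens = sum(stage_tokens)      # 2.25e13 tokens (22.5T)

training_flop = FLOP_PER_PARAM_PER_TOKEN * ACTIVE_PARAMS * total_tokens
print(f"Total tokens:  {total_tokens:.3e}")   # 2.250e+13
print(f"Training FLOP: {training_flop:.3e}")  # 1.215e+24
```

Using active rather than total parameters follows the convention for MoE models, since only the active expert weights participate in each forward/backward pass.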