Model Details

Domain:

Speech

Task:

Text-to-speech (TTS)

Speech synthesis

Model Access:

Open weights (unrestricted)

Introduction

Kokoro is an open-weight TTS model with 82 million parameters. Despite its lightweight architecture, it delivers quality comparable to that of larger models while being significantly faster and more cost-efficient. Because its weights are released under the Apache 2.0 license, Kokoro can be deployed anywhere from production environments to personal projects.

Benchmarking

FLOPs

1.68e20 FLOP (estimated)

Notes: 3.12e14 FLOP/s per GPU × 500 GPU-hours × 3600 s/hour × 0.3 (assumed utilization) = 1.6848e20 FLOP
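
The arithmetic in the note can be reproduced in a few lines of Python. This is a minimal sketch: the function name `estimate_training_flop` and its parameter names are illustrative, and the 0.3 utilization factor is the assumption stated in the note, not a measured value.

```python
def estimate_training_flop(peak_flop_per_sec, gpu_hours, utilization):
    """Estimate total compute from per-GPU peak throughput and GPU-hours."""
    seconds = gpu_hours * 3600  # convert GPU-hours to GPU-seconds
    return peak_flop_per_sec * seconds * utilization


# Values taken from the note; utilization is an assumption, not a measurement.
total = estimate_training_flop(
    peak_flop_per_sec=312e12,
    gpu_hours=500,
    utilization=0.3,
)
print(f"{total:.4e} FLOP")  # prints 1.6848e+20 FLOP
```

Changing `utilization` scales the result linearly, so the estimate is only as reliable as that assumed factor.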

Training

Training Code Accessibility

Weights: Apache 2.0, https://huggingface.co/hexgrad/Kokoro-82M

Inference code: Apache 2.0, https://github.com/hexgrad/kokoro

Hardware

NVIDIA A100 SXM4 80 GB

Size Notes: "<100 hrs" of training audio data

Parameters

82000000

Notes: 82M
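
To put the parameter count in practical terms, the storage needed for the weights alone can be approximated from the number of bytes per parameter. The sketch below is an illustration, not a published figure: the helper `weights_size_mb` is hypothetical, and the byte widths (4 for 32-bit floats, 2 for 16-bit floats) are the standard sizes of those formats rather than anything stated on this page. Activations, buffers, and file-format overhead are not included.

```python
PARAMETERS = 82_000_000  # parameter count from this page


def weights_size_mb(num_params, bytes_per_param):
    """Approximate size of the weights in megabytes (1 MB = 1e6 bytes)."""
    return num_params * bytes_per_param / 1e6


fp32_mb = weights_size_mb(PARAMETERS, 4)  # 32-bit floats: 4 bytes each
fp16_mb = weights_size_mb(PARAMETERS, 2)  # 16-bit floats: 2 bytes each
print(f"fp32: {fp32_mb:.0f} MB, fp16: {fp16_mb:.0f} MB")  # fp32: 328 MB, fp16: 164 MB
```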

Related Models

Kokoro v1.0

By hexgrad

Speech

Kokoro v0.19
