Kokoro is an open-weight TTS model with 82 million parameters. Despite its lightweight architecture, it delivers quality comparable to that of larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, Kokoro can be deployed anywhere from production environments to personal projects.
FLOPs: 168000000000000000000
Notes: 312000000000000 FLOP / GPU / sec * 500 GPU - hours * 3600 sec / hour * 0.3 [assumed utilization] = 1.6848e+20 FLOP
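The arithmetic in the note above can be reproduced with a short script. This is only a sketch: the variable and function names are illustrative, and the 0.3 utilization is the assumption stated in the note, not a measured value.

```python
# Reproduces the training-compute estimate from the note above.
# All names here are illustrative; the numbers come from the note.

PEAK_FLOP_PER_GPU_PER_SEC = 312e12  # per-GPU peak throughput used in the note
GPU_HOURS = 500                     # total GPU-hours of training
SECONDS_PER_HOUR = 3600
ASSUMED_UTILIZATION = 0.3           # assumed fraction of peak actually achieved


def estimate_training_flop(peak_flop_per_sec, gpu_hours, utilization):
    """Return estimated total training FLOP: peak rate * GPU-seconds * utilization."""
    gpu_seconds = gpu_hours * SECONDS_PER_HOUR
    return peak_flop_per_sec * gpu_seconds * utilization


total = estimate_training_flop(
    PEAK_FLOP_PER_GPU_PER_SEC, GPU_HOURS, ASSUMED_UTILIZATION
)
print(f"{total:.4e} FLOP")  # prints "1.6848e+20 FLOP"
```

The estimate scales linearly with the utilization figure, so a different assumption (say 0.4) would change the result proportionally.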
Training Code Accessibility: Apache 2.0 for weights (https://huggingface.co/hexgrad/Kokoro-82M); Apache 2.0 for inference code (https://github.com/hexgrad/kokoro)
Hardware: NVIDIA A100 SXM4 80 GB
Size Notes: "<100 hrs" of training audio data
Parameters: 82000000
Notes: 82M