Bark is a transformer-based text-to-audio model created by Suno. Bark can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects. The model can also produce nonverbal communications like laughing, sighing and crying. To support the research community, we are providing access to pretrained model checkpoints, which are ready for inference and available for commercial use.
Training Code AccessibilityMIT License https://huggingface.co/suno/bark "This model is meant for research purposes only. The model output is not censored and the authors do not endorse the opinions in the generated content. Use at your own risk."
Parameters300000000