Xingchen Super Multi-Dialect Speech Recognition Model v1.0 is pre-trained on 300,000 hours of unlabeled multi-dialect speech data and fine-tuned on 30 types of internal labeled data, overcoming the limitation that a single model can recognize only one specific dialect. It supports understanding of 30 dialects, including Cantonese, Shanghainese, Sichuan dialect, and Wenzhou dialect.
Training Code Accessibility: Apache 2.0 (https://huggingface.co/Tele-AI/TeleSpeech-ASR1.0); no license is mentioned at https://github.com/Tele-AI/TeleSpeech-ASR
Size Notes: "Xingchen Super Multi-Dialect Speech Recognition Model v1.0 is pre-trained by 300,000 hours of unlabeled multi-dialect speech data"
Parameters: 300000000