AIPortalX

Selected filter: Speech Recognition (Task)

By default, only production models are shown

30 Models found

Gemini Robotics-ER 1.5

By Google DeepMind
Domain
Vision, Language, Speech
Task
Instruction interpretation, Robotic manipulation, Image captioning, +5 more

Qwen3-Omni-30B-A3B

By Alibaba
Domain
Multimodal, Language, Vision, +1 more
Task
Language modeling, Language generation, Question answering, +6 more

gpt-realtime

By OpenAI
Domain
Speech, Vision, Language
Task
Speech recognition (ASR), Speech synthesis, Visual question answering, +1 more

Canary 1B v2

By NVIDIA
Domain
Speech
Task
Speech recognition (ASR), Translation, Speech-to-text

Parakeet-tdt-0.6b-v3

By NVIDIA
Domain
Speech
Task
Speech-to-text, Speech recognition (ASR)

Gemini 2.5 Deep Think

By Google
Domain
Language, Multimodal, Vision, +2 more
Task
Language modeling, Language generation, Mathematical reasoning, +6 more

Gemini 2.5 Flash-Lite Jun 2024

By Google DeepMind
Domain
Language, Vision, Video, +1 more
Task
Language modeling, Language generation, Question answering, +9 more

Gemma 3n

By Google
Domain
Language, Multimodal, Speech
Task
Language modeling, Language generation, Question answering, +7 more

Gemini 2.5 Flash

By Google DeepMind
Domain
Language, Multimodal, Vision, +1 more
Task
Language modeling, Language generation, Question answering, +9 more

Gemini 2.5 Pro

By Google DeepMind
Domain
Language, Vision, Video, +1 more
Task
Language modeling, Language generation, Question answering, +6 more

Chirp 3 Speech-to-Text

By Google
Domain
Speech
Task
Speech recognition (ASR), Speech-to-text, Translation

Reka Flash 3

By Reka AI
Domain
Multimodal, Language, Vision, +1 more
Task
Chat, Code generation, Language modeling, +6 more

Baichuan-Omni-1.5

By Baichuan
Domain
Multimodal, Language, Speech, +2 more
Task
Language modeling, Language generation, Question answering, +8 more

Gemini 2.0 Flash

By Google DeepMind
Domain
Language, Vision, Audio, +2 more
Task
Language modeling, Language generation, Question answering, +9 more

Gemini 2.0 Pro

By Google DeepMind
Domain
Language, Multimodal, Vision, +1 more
Task
Code generation, Language modeling, Language generation, +3 more