AiPortalX
Selected filter: Task: Speech recognition ASR
By default, only production models are shown
30 Models found
Gemini Robotics-ER 1.5
By: Google DeepMind
Domain: Vision, Language, Speech
Task: Instruction interpretation, Robotic manipulation, Image captioning, Object detection (+5 more)
Qwen3-Omni-30B-A3B
By: Alibaba
Domain: Multimodal, Language, Vision, Speech (+1 more)
Task: Language modeling, Language generation, Question answering, Visual question answering (+6 more)
gpt-realtime
By: OpenAI
Domain: Speech, Vision, Language
Task: Speech recognition ASR, Speech synthesis, Visual question answering, Speech-to-speech (+1 more)
Canary 1B v2
By: NVIDIA
Domain: Speech
Task: Speech recognition ASR, Translation, Speech-to-text
Parakeet-tdt-0.6b-v3
By: NVIDIA
Domain: Speech
Task: Speech-to-text, Speech recognition ASR
Gemini 2.5 Deep Think
By: Google
Domain: Language, Multimodal, Vision, Video (+2 more)
Task: Language modeling, Language generation, Mathematical reasoning, Code generation (+6 more)
Gemini 2.5 Flash-Lite Jun 2024
By: Google DeepMind
Domain: Language, Vision, Video, Speech (+1 more)
Task: Language modeling, Language generation, Question answering, Translation (+9 more)
Gemma 3n
By: Google
Domain: Language, Multimodal, Speech, Vision
Task: Language modeling, Language generation, Question answering, Chat (+7 more)
Gemini 2.5 Flash
By: Google DeepMind
Domain: Language, Multimodal, Vision, Speech (+1 more)
Task: Language modeling, Language generation, Question answering, Code generation (+9 more)
Gemini 2.5 Pro
By: Google DeepMind
Domain: Language, Vision, Video, Multimodal (+1 more)
Task: Language modeling, Language generation, Question answering, Code generation (+6 more)
Chirp 3 Speech-to-Text
By: Google
Domain: Speech
Task: Speech recognition ASR, Speech-to-text, Translation
Reka Flash 3
By: Reka AI
Domain: Multimodal, Language, Vision, Video (+1 more)
Task: Chat, Code generation, Language modeling, Language generation (+6 more)
Baichuan-Omni-1.5
By: Baichuan
Domain: Multimodal, Language, Speech, Vision (+2 more)
Task: Language modeling, Language generation, Question answering, Audio question answering (+8 more)
Gemini 2.0 Flash
By: Google DeepMind
Domain: Language, Vision, Audio, Speech (+2 more)
Task: Language modeling, Language generation, Question answering, Visual question answering (+9 more)
Gemini 2.0 Pro
By: Google DeepMind
Domain: Language, Multimodal, Vision, Video (+1 more)
Task: Code generation, Language modeling, Language generation, Question answering (+3 more)
Chirp 2 Speech-to-Text
By: Google
Domain: Speech
Task: Speech recognition ASR, Speech-to-text, Translation