AiPortalX
Filters
Selected Filters
Character recognition OCR
By default, only production models are shown
21 Models found
Claude Opus 4.5
By: Anthropic
Domain: Language, Multimodal, Vision
Task: Code generation, Language modeling, Language generation, Quantitative reasoning (+13 more)
GLM-4.5-Air
By: Zhipu AI
Domain: Language
Task: Language modeling, Language generation, Question answering, Visual question answering (+4 more)
Aeneas
By: Google DeepMind
Domain: Vision, Multimodal, Language
Task: Character recognition OCR, Visual question answering
Grok 4
By: xAI
Domain: Language, Multimodal, Vision
Task: Language modeling, Language generation, Question answering, Search (+4 more)
Gemini 2.5 Flash-Lite Jun 2024
By: Google DeepMind
Domain: Language, Vision, Video, Speech (+1 more)
Task: Language modeling, Language generation, Question answering, Translation (+9 more)
Claude Opus 4
By: Anthropic
Domain: Language, Multimodal, Vision
Task: Code generation, Language modeling, Language generation, Quantitative reasoning (+13 more)
Claude Sonnet 4
By: Anthropic
Domain: Language, Multimodal, Vision
Task: Code generation, Language modeling, Language generation, Quantitative reasoning (+13 more)
Gemma 3n
By: Google
Domain: Language, Multimodal, Speech, Vision
Task: Language modeling, Language generation, Question answering, Chat (+7 more)
Reka Flash 3
By: Reka AI
Domain: Multimodal, Language, Vision, Video (+1 more)
Task: Chat, Code generation, Language modeling, Language generation (+6 more)
Mistral OCR
By: Mistral AI
Domain: Multimodal, Vision, Language
Task: Character recognition OCR, Chat, Language generation
NVILA 15B
By: NVIDIA
Domain: Vision, Language, Multimodal, Video
Task: Visual question answering, Video description, Language modeling, Language generation (+2 more)
Aria
By: Rhymes AI
Domain: Multimodal, Language, Video, Vision
Task: Language modeling, Language generation, Visual question answering, Image captioning (+4 more)
Qwen2-VL-2B
By: Alibaba
Domain: Language, Vision, Multimodal
Task: Visual question answering, Video description, Language modeling, Language generation (+4 more)
Qwen2-VL-72B
By: Alibaba
Domain: Language, Vision, Multimodal
Task: Visual question answering, Video description, Language modeling, Language generation (+4 more)
Qwen2-VL-7B
By: Alibaba
Domain: Language, Vision, Multimodal
Task: Visual question answering, Video description, Language modeling, Language generation (+4 more)
Cambrian-1-13B
By: New York University (NYU)
Domain: Multimodal, Vision, Language
Task: Image captioning, Visual question answering, Character recognition OCR