AI Tools Directory

104 tools found in multimodal

Idefics

Multimodal AI · Multimodal
FREE

Open-source image and text understanding model.

Aria

Multimodal AI · Multimodal
FREE

Open, natively multimodal model for text and images.

Idefics3

Multimodal AI · Multimodal
FREE

Open multimodal model with document understanding.

Fuyu

Multimodal AI · Multimodal
FREE

Multimodal model designed for digital agents.

Magika

Multimodal AI · Multimodal
FREE

Google's AI for accurate file type detection.

Emu2

Multimodal AI · Multimodal
FREE

BAAI's multimodal model for visual generation.

LLaVA 1.6

Multimodal AI · Multimodal
FREE

Improved open-source vision model with better OCR.

BLIP3 xGen

Multimodal AI · Multimodal
FREE

Salesforce vision model with video support.

Show-O

Multimodal AI · Multimodal
FREE

Unified understanding and generation model.

Ovis 2

Multimodal AI · Multimodal
FREE

High-performance open multimodal model.

Phi-3 Vision

Multimodal AI · Multimodal
FREE

Microsoft's small but powerful vision model.

SmolVLM

Multimodal AI · Multimodal
FREE

Tiny 2B vision model for phones and edge devices.

Microsoft Phi-3.5 Vision

Multimodal AI · Multimodal
FREE

Lightweight Microsoft vision model for edge devices.

Grok Vision

Multimodal AI · Multimodal
FREEMIUM

Grok image analysis with real-time X data.

Argilla AI

Multimodal AI · Multimodal
FREE

Open-source NLP annotation for fine-tuning.

Google Vision API

Multimodal AI · Multimodal
FREEMIUM

Google Cloud AI for image analysis.

OWL-ViT

Multimodal AI · Multimodal
FREE

Google's zero-shot object detection from text queries.

Sightengine

Multimodal AI · Multimodal
FREEMIUM

AI safety detection for images and video.

LLaVA

Multimodal AI · Multimodal
FREE

Open-source multimodal vision and language model.

Aya by Cohere

Multimodal AI · Multimodal
FREE

Open-source multilingual model supporting 101 languages.

Qwen VL

Multimodal AI · Multimodal
FREE

Alibaba's vision-language model for complex tasks.

CogVLM

Multimodal AI · Multimodal
FREE

Open-source model with deep visual-language integration.

Mistral Large

Multimodal AI · Multimodal
PAID

Mistral's flagship 123B parameter model.

Nous Hermes

Multimodal AI · Multimodal
FREE

Fine-tuned open-source model for instruction tasks.
