AIchitect

Multimodal

10 tools
OSS · Free · Popular

LLaVA

Open-source multimodal LLM assistant

⭐ 22,000
OSS · Free · Popular

Moondream

Tiny OSS vision language model

⭐ 11,000
OSS · Free

Qwen-VL

Alibaba's open-weight vision-language model line (Qwen2.5-VL → Qwen3-VL)

⭐ 15,000
Free

Fal.ai

Fast serverless inference API for image, video, and audio models

⭐ 10,000
OSS · Free

InternVL2

Top OSS multimodal model from OpenGVLab

⭐ 7,800
OSS · Free

PaliGemma

Google's OSS vision-language model

⭐ 3,200

Pixtral

Mistral's vision-language model — folded into Mistral Small 4 (2026)

Free

Runway Gen-4.5

Frontier video generation with character/scene consistency

Free

Google Veo 3.1

Google's frontier video generation model with native audio

Free

Kling 3.0

Frontier video generation from Kuaishou
