⚠ This tool appears inactive — no commits in 90+ days. Consider an alternative.
MultimodalOpen Source✦ Free Tier

LLaVA

Open-source multimodal LLM assistant

22,000 stars● Health 55/100 — Slowing· commit recency (40 pts) · star momentum (30 pts) · issue ratio (20 pts) · forks (10 pts)App Infrastructure

About

Large Language and Vision Assistant — connects a vision encoder to an LLM for instruction-following with images. OSS research model widely used as a multimodal base. Runs via Ollama.

Choose LLaVA when…

  • You want an open-source multimodal model for self-hosted deployment
  • You're doing research on vision-language instruction following
  • You need a well-documented baseline for multimodal tasks

Builder Slot

How does your AI see and understand images?Optional for most stacks

Vision-language models for image understanding, captioning, visual QA, and document parsing

Dev Tools
Not applicable
App Infra
Optional
Hybrid
Optional

Other tools in this slot:

Integrates with (1)

OllamaLLM Infrastructure

Ollama bundles LLaVA as a local model, exposing it via an OpenAI-compatible REST endpoint.

Run LLaVA vision tasks offline with a single command — no GPU cloud account required.

Compare →

Alternatives to consider (2)

Pricing

✦ Free tier available

Recent Activity

View all activity for this tool →

Badge

Add to your GitHub README

LLaVA on AIchitect[![LLaVA](https://www.aichitect.dev/badge/tool/llava)](https://www.aichitect.dev/tool/llava)

Explore the full AI landscape

See how LLaVA fits into the bigger picture — browse all 207 tools and their relationships.

Explore graph →