OSS · Free · Popular

Hugging Face

Open ML model hub and inference platform

135,000
OSS · Free · Popular

Ollama

Run LLMs locally via simple CLI/API

90,000
OSS · Free · Popular

llama.cpp

C++ LLM inference for local and edge deployment

68,000
OSS · Free · Popular

LiteLLM

Universal LLM proxy — 100+ models, one API

16,000
Free · Popular

OpenRouter

Unified API routing to 200+ LLMs

Free · Popular

Pinecone

Managed vector DB service

Popular

Anthropic API

Claude models API by Anthropic

Popular

OpenAI API

GPT-5 era models, embeddings, and Responses API from OpenAI

Free · Popular

Google Gemini API

Google's frontier multimodal model API — Gemini Pro and Flash

Free · Popular

Amazon Bedrock

AWS managed AI service with access to frontier models from multiple providers

Popular

Azure OpenAI

OpenAI models hosted on Azure with enterprise compliance and SLAs

OSS · Free

Ray

Distributed computing framework for ML workloads

33,000
OSS · Free

vLLM

High-throughput LLM serving with PagedAttention

32,000
OSS · Free

Milvus

Distributed vector database built for scale

31,000
OSS · Free

Redis Vector

In-memory vector search built into Redis — no separate DB needed

23,000
OSS · Free

Qdrant

High-performance vector DB with filtering

20,000
OSS · Free

Chroma

Lightweight embedded vector DB for AI apps

15,000
OSS · Free

pgvector

PostgreSQL extension for vector similarity search

13,000
OSS · Free

Vercel AI SDK

TypeScript SDK for streaming AI UIs

12,000
OSS · Free

Weaviate

Cloud-native vector search engine

11,000
OSS · Free

LanceDB

Serverless vector DB built on Apache Arrow — embedded or cloud

5,800

RunPod

Serverless GPU cloud for AI inference and training

1,200
Free

Unify

Route prompts to the best model dynamically by cost, speed, or quality

800
Free

Portkey

AI gateway with routing, fallbacks, and caching

Free

Groq

Ultra-fast LLM inference via LPU hardware

Free

Together AI

Fast inference API for open-source models

Free

Fireworks AI

Fast inference with function calling and fine-tuning

Free

Modal

Cloud platform for GPU inference and training

Free

Mistral API

Mistral Large, Mistral Small 4 (unified multimodal), and open-weight families

Free

Cohere API

Command and Embed models for enterprise NLP

Free

Replicate

Run open-source ML models via API

Free

Martian

Intelligent model router that picks the right LLM for every request

Free

Not Diamond

AI model router that learns which LLM performs best for your tasks

Free

Cerebras

Wafer-scale chip inference, marketed as one of the fastest LLM APIs available

Free

DeepInfra

Serverless GPU inference for open-source LLMs at low cost

Baseten

Deploy any ML model as a low-latency production API

Lambda Labs

GPU cloud and API for training and serving AI models

Perplexity API

LLM API with real-time web search grounding built in

Free

turbopuffer

Serverless vector database built for scale — no infrastructure to manage

Free

xAI Grok API

xAI's Grok models with real-time knowledge and strong reasoning

Free

DeepSeek API

High-performance frontier model API at a fraction of the cost

Free

SambaNova Cloud

High-speed LLM inference API, advertised at 200+ tokens/sec on Llama 405B

Free

Cloudflare AI Gateway

LLM gateway with caching, analytics, and rate limiting
