LLM Infrastructure · Open Source · ✦ Free Tier

LiteLLM

Universal LLM proxy — 100+ models, one API

16,000 stars · Health 75 · Active · Dev Productivity & App Infrastructure

About

OSS proxy that normalizes 100+ LLMs to the OpenAI format. Add routing, fallbacks, caching, and cost tracking in one layer.
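As a reference point, here is a minimal sketch of a proxy configuration in the `litellm_config.yaml` format; the model IDs, key references, and the fallback pairing are illustrative placeholders, not a recommended setup:

```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20240620
      api_key: os.environ/ANTHROPIC_API_KEY

# If a gpt-4o call fails, retry the request against claude-sonnet.
router_settings:
  fallbacks:
    - gpt-4o: [claude-sonnet]
```

Caching and cost-tracking settings live in the same file, which is why the page describes all of this as "one layer".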

Choose LiteLLM when…

  • You want a unified API across 100+ LLM providers
  • You're switching between providers or running A/B tests
  • You need fallbacks and load balancing across models
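To make the fallback behavior concrete, here is a plain-Python sketch of the pattern (no LiteLLM import, and the model names are placeholders — in real use you declare fallbacks in LiteLLM's config rather than writing this loop yourself):

```python
# Sketch of the fallback pattern a router applies: try each model in
# order, return the first success, and re-raise only if all fail.
def route_with_fallbacks(call, models):
    """Try each model in order; return (model, result) on first success."""
    last_err = None
    for model in models:
        try:
            return model, call(model)
        except Exception as err:
            last_err = err
    raise last_err

# Simulated providers: the primary is rate-limited, the fallback answers.
def fake_call(model):
    if model == "gpt-4o":
        raise RuntimeError("429: rate limited")
    return "hello from " + model

used, reply = route_with_fallbacks(fake_call, ["gpt-4o", "claude-3-5-sonnet"])
print(used, reply)  # the call falls through to the second model
```

The declarative equivalent is one `fallbacks` entry in LiteLLM's config; the point of the sketch is only to show what the router does on your behalf.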

Builder Slot

Which models does your stack route through?
Optional for most stacks

A gateway that normalizes calls across providers — one API for all models, with fallbacks
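The reason one API can cover all models: every OpenAI-compatible client sends the same request shape, so putting a gateway in front is only a base-URL change. A stdlib-only sketch (the localhost URL is a placeholder for a locally running LiteLLM proxy):

```python
import json

# Build the OpenAI-style chat request any compatible client sends.
# Only the base URL distinguishes "direct to provider" from "via proxy".
def chat_request(base_url, model, prompt):
    return {
        "url": base_url.rstrip("/") + "/chat/completions",
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

direct = chat_request("https://api.openai.com/v1", "gpt-4o", "hi")
proxied = chat_request("http://localhost:4000/v1", "gpt-4o", "hi")
# Same body either way; only the URL differs.
```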

  • Dev Tools: Not applicable
  • App Infra: Optional
  • Hybrid: Optional

Other tools in this slot:

Stack Genome Detection

AIchitect's Genome scanner detects LiteLLM in your project via these signals:

  • npm packages: litellm
  • pip packages: litellm
  • env vars: LITELLM_API_KEY, LITELLM_BASE_URL
  • config files: litellm_config.yaml

Integrates with (27)

Continue · Coding Assistants

Continue points to LiteLLM's OpenAI-compatible proxy endpoint, routing all completions and chat through any provider LiteLLM supports.

Model flexibility in Continue without changing editor config — swap providers at the LiteLLM level.
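A sketch of what that looks like in Continue's `config.json` — the `apiBase` host, port, and key are placeholders for your LiteLLM proxy:

```json
{
  "models": [
    {
      "title": "Via LiteLLM",
      "provider": "openai",
      "model": "gpt-4o",
      "apiBase": "http://localhost:4000/v1",
      "apiKey": "sk-litellm-placeholder"
    }
  ]
}
```

Because the provider is declared as `openai`, Continue never knows which upstream model LiteLLM actually routes to.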

Compare →
Aider · Coding Assistants

Aider accepts LiteLLM's OpenAI-compatible proxy as its model backend via the --openai-api-base flag.

Model-agnostic Aider sessions — route to Claude, GPT-4o, Gemini, or local models through one LiteLLM config.
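In Aider's `.aider.conf.yml` form this is a two-line sketch (host, port, and model are placeholders; the same values can be passed on the `--openai-api-base` flag named above):

```yaml
# Point Aider's OpenAI-compatible backend at a local LiteLLM proxy.
openai-api-base: http://localhost:4000
model: gpt-4o
```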

Compare →
OpenHands · Autonomous Agents

OpenHands routes all LLM calls through LiteLLM, making it model-agnostic — swap Claude, GPT-4o, or a local model via config.

Model flexibility for autonomous agents without changing OpenHands code — route to the best model per task type.

Compare →
Plandex · Coding Assistants

Plandex accepts LiteLLM-proxied endpoints as its model backend, routing its multi-step planning to any provider.

Model-agnostic long-horizon coding plans — use Claude for reasoning-heavy planning, cheaper models for mechanical steps.

Compare →
CrewAI · Agent Frameworks

CrewAI routes LLM calls through LiteLLM, making any model accessible to any agent in the crew via a unified API.

Mix models across agents in the same crew — one agent on Claude, another on GPT-4o — through one LiteLLM config change.

Compare →
AutoGen · Agent Frameworks

AutoGen accepts LiteLLM-proxied endpoints as model backends via its model_client configuration.

Provider-flexible AutoGen agents — route different agent roles to different models through one LiteLLM config.

Compare →
LangGraph · Agent Frameworks

LangGraph nodes call LiteLLM-proxied endpoints, routing each node's LLM calls to any provider without changing graph code.

Model-agnostic LangGraph agents — route different nodes to Claude, GPT-4o, or local models via one LiteLLM config.

Compare →
Semantic Kernel · Agent Frameworks

Semantic Kernel accepts LiteLLM's OpenAI-compatible endpoint through a custom connector configuration.

Model-agnostic Semantic Kernel agents — swap providers at the LiteLLM layer without changing .NET or Python agent code.

Compare →
LangChain · Pipelines & RAG

LangChain accepts LiteLLM's OpenAI-compatible endpoint as a drop-in model connector, routing all LLM calls through the proxy.

Provider-agnostic LangChain chains — swap between Claude, GPT-4o, and open models by changing one LiteLLM config line.

Compare →
LlamaIndex · Pipelines & RAG

LlamaIndex accepts LiteLLM's OpenAI-compatible proxy as its LLM backend, routing all generation through any provider.

Model-agnostic LlamaIndex pipelines — swap the generation model without touching retrieval or indexing code.

Compare →
DSPy · Pipelines & RAG

DSPy routes its compilation and inference calls through LiteLLM, enabling multi-provider optimization runs.

Model-agnostic DSPy compilation — optimize prompts across Claude, GPT-4o, and open models in one run to find the best-performing variant.

Compare →
Dify · Pipelines & RAG

Dify accepts LiteLLM's OpenAI-compatible endpoint as a custom model provider in its model settings.

Every model LiteLLM supports becomes available in Dify's no-code workflow builder without touching the workflow itself.

Compare →
Agno · Agent Frameworks

Agno accepts LiteLLM-proxied endpoints as model backends, giving its agents access to any provider through the gateway.

Model flexibility across Agno agents — route different agent roles to different models through one LiteLLM config.

Compare →
Langfuse · LLM Infrastructure

LiteLLM sends callback events to Langfuse after every LLM call — one config line captures cost, model, tokens, and latency per request.

Per-request observability across every provider in your stack without changing application code.
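As a config sketch, the callback is one line in the proxy file (this assumes Langfuse credentials are provided via the standard `LANGFUSE_PUBLIC_KEY` and `LANGFUSE_SECRET_KEY` environment variables):

```yaml
litellm_settings:
  # Emit a Langfuse event after every successful LLM call.
  success_callback: ["langfuse"]
```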

Compare →
Helicone · LLM Infrastructure

LiteLLM can route calls through Helicone as a proxy layer or log directly to Helicone's API after each call.

Request replay, caching, and Helicone's rate-limit features layered on top of LiteLLM's provider routing.

Compare →
Ollama · LLM Infrastructure

LiteLLM recognizes Ollama's local API and includes its models in the unified provider list alongside cloud providers.

Local Ollama models treated identically to cloud providers — route between local and cloud by changing one parameter.
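A sketch of registering a local Ollama model alongside cloud entries in `litellm_config.yaml` (the model name is illustrative; the port is Ollama's default):

```yaml
model_list:
  - model_name: local-llama
    litellm_params:
      model: ollama/llama3          # provider prefix for local Ollama
      api_base: http://localhost:11434
```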

Compare →
Groq · LLM Infrastructure

LiteLLM routes to Groq's API using its provider prefix, normalizing Groq's interface into the unified format.

Ultra-fast inference on latency-sensitive paths — route to Groq for speed, other providers for quality, via one config.

Compare →
Together AI · LLM Infrastructure

LiteLLM routes to Together AI's inference API, including its open-source model catalog.

Open-source model access at scale via LiteLLM — route cost-sensitive paths to Together AI without changing application code.

Compare →
vLLM · LLM Infrastructure

LiteLLM connects to a self-hosted vLLM endpoint via its OpenAI-compatible API, treating it as any other provider.

Self-hosted GPU inference via vLLM accessible through the same LiteLLM interface as cloud providers — one config for everything.
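A sketch of pointing `litellm_config.yaml` at a self-hosted vLLM server through its OpenAI-compatible endpoint (host, port, and served model name are placeholders):

```yaml
model_list:
  - model_name: self-hosted
    litellm_params:
      # The openai/ prefix plus api_base targets any OpenAI-compatible server.
      model: openai/meta-llama/Llama-3.1-8B-Instruct
      api_base: http://vllm-host:8000/v1
      api_key: none   # vLLM typically needs no key; a dummy value satisfies the client
```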

Compare →
Anthropic API · LLM Infrastructure

LiteLLM wraps Anthropic's API with its provider prefix, normalizing Claude's API into OpenAI-compatible format.

Claude accessible via the same interface as every other provider — swap with a config change, no code modifications.

Compare →
OpenAI API · LLM Infrastructure

LiteLLM routes to OpenAI's API natively, treating it as the default provider in its unified format.

OpenAI access through LiteLLM's multi-provider interface — add fallbacks, cost controls, and model swapping without touching app code.

Compare →
Mistral API · LLM Infrastructure

LiteLLM routes to Mistral's API using its provider prefix, normalizing Mistral's interface into the standard format.

Cost-effective European model access through the same LiteLLM config as Claude and GPT-4o.

Compare →
Cohere API · LLM Infrastructure

LiteLLM routes to Cohere's API for both generation and embedding models.

Cohere's strong embedding models accessible alongside generation models through one LiteLLM config — useful for RAG pipelines.

Compare →
Fireworks AI · LLM Infrastructure

LiteLLM routes to Fireworks AI's fast open-source inference API using its provider prefix.

High-throughput, low-latency open-source inference via Fireworks AI routed through the same LiteLLM interface as other providers.

Compare →
Vercel AI SDK · LLM Infrastructure

The Vercel AI SDK can point to LiteLLM's OpenAI-compatible endpoint as a custom provider, routing all SDK calls through LiteLLM.

Provider-agnostic Vercel AI SDK apps — swap between Claude, GPT-4o, and open models at the LiteLLM layer without changing SDK code.

Compare →
Not Diamond · LLM Infrastructure

Not Diamond can be layered over LiteLLM, adding intelligent task-based routing on top of LiteLLM's unified provider API.

Best-of-both: LiteLLM's provider coverage with Not Diamond's quality-driven routing.

Compare →

Often paired with (2)

Alternatives to consider (3)

Pricing

✦ Free tier available
Enterprise: Custom

In 5 stacks

Ruled out by 1 stack

Evaluation & Quality Stack
Reason: provider routing is out of scope; this stack assumes you've already settled on providers and need to evaluate them.

Badge

Add to your GitHub README

LiteLLM on AIchitect:
[![LiteLLM](https://aichitect.dev/badge/tool/litellm)](https://aichitect.dev/tool/litellm)

Explore the full AI landscape

See how LiteLLM fits into the bigger picture — browse all 207 tools and their relationships.

Explore graph →