OSS LLM engineering platform
Open-source platform for tracing, evaluations, and prompt management. Self-hostable alternative to LangSmith with clean UX.
Traces every LLM call, eval, and cost so you know exactly what your stack is doing
AIchitect's Genome scanner detects Langfuse in your project via these signals:
- the `langfuse` package dependency
- the `LANGFUSE_SECRET_KEY`, `LANGFUSE_PUBLIC_KEY`, and `LANGFUSE_HOST` environment variables

CrewAI exports OpenTelemetry traces that Langfuse ingests, capturing every agent step, tool call, and LLM invocation across the crew.
→ Full multi-agent observability — which agent did what, at what cost, in what order, with what output.
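As a rough sketch of what detecting those signals could look like (the file names and logic here are illustrative, not AIchitect's actual scanner implementation):

```python
from pathlib import Path

# Signals that suggest Langfuse is in a project: the package name in a
# dependency manifest, or Langfuse environment variables in a .env file.
# (Illustrative sketch only -- not AIchitect's actual Genome scanner.)
PACKAGE_SIGNAL = "langfuse"
ENV_SIGNALS = ("LANGFUSE_SECRET_KEY", "LANGFUSE_PUBLIC_KEY", "LANGFUSE_HOST")

def detect_langfuse(project_dir: str) -> bool:
    """Return True if any Langfuse signal appears in the project."""
    root = Path(project_dir)
    # Check common dependency manifests for the package name.
    for manifest in ("requirements.txt", "pyproject.toml", "package.json"):
        path = root / manifest
        if path.is_file() and PACKAGE_SIGNAL in path.read_text():
            return True
    # Check .env-style files for the Langfuse environment-variable keys.
    for env_file in root.glob(".env*"):
        if any(key in env_file.read_text() for key in ENV_SIGNALS):
            return True
    return False
```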
AutoGen emits OpenTelemetry traces that Langfuse captures, recording every agent message, tool call, and LLM interaction.
→ Conversation-level observability across multi-agent AutoGen runs — trace which agent said what and at what token cost.
LangGraph integrates with Langfuse via its callback system or OpenTelemetry, capturing every node execution as a nested trace span.
→ Full execution traces of complex agent graphs — cost per node, latency per step, and LLM call details in one view.
Langfuse provides a LangChain callback handler that captures every chain, LLM call, and tool invocation as a nested trace.
→ Full execution traces for any LangChain application — cost, latency, and prompt quality in one view.
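A minimal sketch of attaching that handler, assuming the `langfuse` and `langchain` packages are installed and `LANGFUSE_*` credentials are in the environment (the import path shown is the v2 SDK's; v3 exposes it under `langfuse.langchain`, and `chain` stands in for any runnable you have built):

```python
from langfuse.callback import CallbackHandler

handler = CallbackHandler()  # reads LANGFUSE_PUBLIC_KEY / SECRET_KEY / HOST from env

# Pass the handler on any invoke call; every chain step, LLM call, and
# tool invocation is then recorded as a nested trace in Langfuse.
result = chain.invoke({"topic": "observability"}, config={"callbacks": [handler]})
```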
Langfuse provides a LlamaIndex callback handler that traces every query, retrieval call, and LLM generation within the pipeline.
→ Retrieval-level observability: see which chunks were fetched, at what similarity score, and what the LLM did with them.
Dify exports traces via its observability integration to Langfuse, capturing every LLM call and tool invocation in its workflows.
→ Observability on top of Dify's no-code AI apps — trace costs and latency even without writing pipeline code.
Mastra integrates with Langfuse via OpenTelemetry, tracing every agent step and LLM call automatically.
→ Out-of-the-box observability for Mastra agents — cost, latency, and full trace quality without custom instrumentation.
Agno sends traces to Langfuse via its built-in OpenTelemetry integration.
→ Full observability on Agno agent runs — multi-step traces with per-step cost and latency breakdown.
LiteLLM sends callback events to Langfuse after every LLM call — one config line captures cost, model, tokens, and latency per request.
→ Per-request observability across every provider in your stack without changing application code.
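That one config line is a callback setting on the client; a minimal sketch, assuming `litellm` and `langfuse` are installed and the `LANGFUSE_*` environment variables plus a provider API key are set:

```python
import litellm

# After this single line, every completion() call reports cost, model,
# token counts, and latency to Langfuse -- regardless of provider.
litellm.success_callback = ["langfuse"]

response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "ping"}],
)
```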
Portkey's gateway logs metadata to Langfuse via webhook integration, enriching Langfuse traces with gateway-level cost and caching data.
→ Combined gateway analytics and LLM trace quality in one view — Portkey's proxy layer meets Langfuse's evaluation depth.
Ragas uploads evaluation scores directly to Langfuse as trace scores, linking eval results to the specific traces they evaluated.
→ Eval results pinned to the exact traces that generated them — jump from a poor metric score directly to the failing trace.
DeepEval sends evaluation results to Langfuse as trace scores via its Langfuse integration.
→ Quality metrics — faithfulness, hallucination rate, G-Eval scores — visible alongside the raw traces that produced them.
Langfuse's SDK wraps OpenAI's client, capturing every API call with token counts, cost, and latency automatically.
→ Per-call observability on OpenAI usage — see exactly which prompts are expensive, slow, or producing poor outputs.
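The wrapper is a drop-in import substitution; a minimal sketch, assuming `langfuse` is installed and `LANGFUSE_*` plus `OPENAI_API_KEY` are set in the environment:

```python
# Drop-in replacement for `import openai` -- same client API,
# but every call is traced to Langfuse automatically.
from langfuse.openai import openai

response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
# Token counts, cost, and latency for this call now appear in Langfuse.
```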
Langfuse production traces can be exported as eval datasets that Promptfoo uses for regression testing in CI.
→ Close the eval loop: real failures captured in Langfuse become the regression test cases Promptfoo runs on every deploy.
Langfuse traces are exported as datasets to Braintrust, where they become versioned experiment inputs for systematic eval tracking.
→ Production traces feed directly into structured experiments — Langfuse captures what happened, Braintrust measures whether it was good.
Langfuse's SDK wraps the Vercel AI SDK's model calls, capturing every streaming generation with token counts and latency.
→ Per-request observability on all AI calls made through the Vercel AI SDK — cost and quality metrics without changing streaming code.
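A minimal sketch of the OpenTelemetry wiring, assuming the `ai`, `@ai-sdk/openai`, `@opentelemetry/sdk-node`, and `langfuse-vercel` packages are installed and `LANGFUSE_*` environment variables are set:

```typescript
import { openai } from "@ai-sdk/openai";
import { generateText } from "ai";
import { NodeSDK } from "@opentelemetry/sdk-node";
import { LangfuseExporter } from "langfuse-vercel";

// Route the AI SDK's telemetry spans to Langfuse.
const sdk = new NodeSDK({ traceExporter: new LangfuseExporter() });
sdk.start();

const { text } = await generateText({
  model: openai("gpt-4o-mini"),
  prompt: "Summarize our latency numbers.",
  experimental_telemetry: { isEnabled: true }, // opt in per call
});
```

Streaming calls (`streamText`) take the same `experimental_telemetry` flag, so no streaming code changes are needed.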
[Explore the full AI landscape](https://aichitect.dev/tool/langfuse)
See how Langfuse fits into the bigger picture — browse all 207 tools and their relationships.