Prompt & Eval · Open Source · ✦ Free Tier

Inspect

Open-source LLM evaluation framework by the UK AI Safety Institute

⭐ 1,800 stars

● Health 85/100, Active · commit recency (40 pts) · star momentum (30 pts) · issue ratio (20 pts) · forks (10 pts)

App Infrastructure

About

Inspect is an open-source framework for building LLM evaluations, developed by the UK AI Safety Institute. It provides task composition, built-in datasets, scorers, and solvers for systematic benchmarking of LLM capabilities, safety, and alignment properties.
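The dataset, solver, and scorer roles described above can be illustrated with a minimal, library-free sketch. It deliberately does not use Inspect's API: `Sample`, `fake_model`, `exact_match`, and `run_task` are hypothetical names invented here, and `fake_model` is a canned stand-in for a real LLM call. In Inspect itself, the equivalent pieces are composed into a `Task` from the `inspect_ai` package.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Sample:
    input: str   # prompt shown to the model
    target: str  # expected answer


# A solver turns a prompt into a model answer; a scorer grades that answer.
Solver = Callable[[str], str]
Scorer = Callable[[str, str], bool]


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call: answers from a fixed lookup."""
    canned = {"What is 2 + 2?": "4", "Capital of France?": "Lyon"}
    return canned.get(prompt, "")


def exact_match(answer: str, target: str) -> bool:
    """Return True when the normalized answer equals the target."""
    return answer.strip().lower() == target.strip().lower()


def run_task(dataset: list[Sample], solver: Solver, scorer: Scorer) -> float:
    """Run every sample through the solver, score it, and return accuracy."""
    if not dataset:
        return 0.0
    correct = sum(scorer(solver(s.input), s.target) for s in dataset)
    return correct / len(dataset)


dataset = [
    Sample("What is 2 + 2?", "4"),
    Sample("Capital of France?", "Paris"),
]
accuracy = run_task(dataset, fake_model, exact_match)
print(f"accuracy = {accuracy:.2f}")  # prints "accuracy = 0.50"
```

Keeping the solver and scorer as separate callables is what makes the pieces swappable: the same dataset can be rerun against a different model or graded with a different scoring rule without touching the rest of the task.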

Choose Inspect when…

•you are running capability and safety evaluations on LLMs
•you are building custom benchmarks for model comparison
•you need a government-backed evaluation methodology

Builder Slot

How do you know it's working?

Optional for most stacks

Tests, evals, and experiment tracking to measure and improve your AI output quality

Dev Tools: Not applicable

App Infra: Recommended

Hybrid: Optional

Other tools in this slot:

PromptFoo, DeepEval, RAGAS, Vellum, PromptLayer, Agenta, TruLens, Humanloop

Stack Genome Detection

AIchitect's Genome scanner detects Inspect in your project via these signals:

pip packages

inspect-ai
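As a rough illustration of what a pip-package signal could look like, the sketch below checks requirements.txt-style text for `inspect-ai`. How the Genome scanner actually works is not documented here, so this is an assumption, and `normalize`, `declared_packages`, and `uses_inspect` are hypothetical helper names. The one grounded detail is the name normalization, which follows PEP 503 (lowercase, with runs of `-`, `_`, and `.` collapsed to `-`).

```python
import re


def normalize(name: str) -> str:
    """Normalize a package name per PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def declared_packages(requirements_text: str) -> set[str]:
    """Extract normalized package names from requirements.txt-style text."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(normalize(match.group(0)))
    return names


def uses_inspect(requirements_text: str) -> bool:
    """Return True when the text declares the inspect-ai package."""
    return "inspect-ai" in declared_packages(requirements_text)


# Sample input; the version pin is illustrative only.
sample = """\
# evaluation dependencies
Inspect_AI>=0.3
openai
"""
print(uses_inspect(sample))  # prints "True"
```

Normalizing before comparing means spellings such as `Inspect_AI` and `inspect.ai` still match, while a differently named package such as `inspect-ai-extras` (a made-up name) does not.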

Alternatives to consider (1)

DeepEval

Pricing

✦ Free tier available

Open Source: Free

Pulse

● No incidents in the last 90 days

Recent Activity

Health ↑ 75 → 90 (3 months ago)

Pricing updated (4 months ago)

In 1 stack

Evaluation & Quality Stack

Badge

Add to your GitHub README

[![Inspect](https://www.aichitect.dev/badge/tool/inspect-ai)](https://www.aichitect.dev/tool/inspect-ai)
