Fine-tuning · Open Source · ✦ Free Tier

TRL

Hugging Face's library for RLHF, DPO, and GRPO fine-tuning

⭐ 12,000 stars · ● Health 95/100 (Active): commit recency (40 pts) · star momentum (30 pts) · issue ratio (20 pts) · forks (10 pts)

App Infrastructure

About

TRL (Transformer Reinforcement Learning) is Hugging Face's library for post-training language models, and the standard toolkit for RLHF, DPO, GRPO, and reward modeling. GRPO, the DeepSeek technique that sparked the wave of reasoning models, was popularized through TRL. TRL works directly with the rest of the Hugging Face ecosystem, including Transformers models, Datasets, and Accelerate.

Choose TRL when…

• You're training with RLHF, DPO, or GRPO
• You want the standard library for reward modeling and alignment fine-tuning
• You need tight integration with the Hugging Face ecosystem
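
To give a feel for what a preference-based objective such as DPO optimizes, here is a minimal sketch of the per-example DPO loss in plain Python. It is not TRL's implementation: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are illustrative assumptions, and a real run would compute the log-probabilities from a policy model and a frozen reference model over batches.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss from summed sequence log-probabilities.

    All names and the default beta are illustrative, not TRL's API.
    """
    # How much more (in log space) the policy likes each answer than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin: positive when the policy favors the chosen answer.
    margin = beta * (chosen_logratio - rejected_logratio)

    # loss = -log(sigmoid(margin)), written in a numerically stable form.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the policy equals the reference, the margin is 0 and the loss is ln 2.
baseline = dpo_loss(-10.0, -10.0, -10.0, -10.0)

# Raising the chosen answer and lowering the rejected one reduces the loss.
improved = dpo_loss(-5.0, -15.0, -10.0, -10.0)
```

The loss falls as the policy assigns relatively more probability to the chosen response than the reference model does, which is the behavior a preference dataset is meant to reward.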

Builder Slot

How do you adapt models to your domain?

Optional for most stacks

Fine-tuning frameworks and platforms for training custom model adaptations with LoRA, QLoRA, or full fine-tuning

Dev Tools: Not applicable

App Infra: Optional

Hybrid: Optional

Other tools in this slot:

Axolotl, Unsloth, LlamaFactory, Torchtune, Predibase

Stack Genome Detection

AIchitect's Genome scanner detects TRL in your project via these signals:

pip packages

trl
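
As an illustration of what a pip-package signal like this could look like in practice, the sketch below checks a requirements-style listing for the `trl` package. It is a hypothetical example, not AIchitect's actual scanner: the function name `detects_trl` and the choice to read `requirements.txt`-style text are assumptions.

```python
import re


def detects_trl(requirements_text: str) -> bool:
    """Return True if a requirements-style listing names the `trl` package.

    Hypothetical helper for illustration; not the Genome scanner's real logic.
    """
    for raw_line in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        # The package name is the leading run of name characters,
        # before any extras ("[peft]") or version specifiers ("==0.9.0").
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match is None:
            continue
        # Compare case-insensitively with separators normalized.
        name = re.sub(r"[-_.]+", "-", match.group(0)).lower()
        if name == "trl":
            return True
    return False


found = detects_trl("torch>=2.0\ntrl[peft]==0.9.0  # fine-tuning\n")
not_found = detects_trl("transformers\ntrlx==0.7.0\n# trl\n")
```

Matching on the full normalized name rather than a substring avoids false positives from similarly named packages such as `trlx`.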

Integrates with (1)

HuggingFace · LLM Infrastructure

TRL is published by HuggingFace and is the canonical post-training library on top of the Transformers stack. It works directly with HF models, Datasets, and Accelerate.

→ Run RLHF, DPO, and reward-model training natively on top of the HuggingFace Transformers ecosystem.
