Hugging Face's library for RLHF, DPO, and GRPO fine-tuning
Hugging Face's Transformer Reinforcement Learning (TRL) library, the standard toolkit for RLHF, DPO, GRPO, and reward modeling. GRPO, the DeepSeek technique that sparked the reasoning model wave, was popularized through TRL. It integrates with the rest of the Hugging Face ecosystem (Transformers, Datasets, Accelerate).
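As a rough illustration of the idea behind GRPO (Group Relative Policy Optimization), the sketch below scores each sampled completion relative to the other completions for the same prompt, rather than against a learned value function. It is a minimal, standard-library-only sketch, not TRL's implementation: the function name `group_relative_advantages`, the `eps` parameter, and the use of the population standard deviation are assumptions made for this example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and std of its own group.

    `rewards` holds one scalar reward per completion sampled for a single
    prompt. The name, `eps`, and population std are illustrative assumptions.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    # eps avoids division by zero when every completion gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for one prompt, scored by a hypothetical 0/1 reward function.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that beat their group's average get a positive advantage and the rest get a negative one, so the policy update favors the relatively better samples without needing a separate critic model.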
Fine-tuning frameworks and platforms for training custom model adaptations with LoRA, QLoRA, or full fine-tuning
AIchitect's Genome scanner detects TRL in your project via these signals:
trl
TRL is published by Hugging Face and is the canonical post-training library on top of the Transformers stack: it consumes HF models, datasets, and accelerate seamlessly.
→ Run RLHF, DPO, and reward-model training natively on top of the HuggingFace Transformers ecosystem.