TRL Fine Tuning

MLOps

Fine-tune LLMs using reinforcement learning with TRL: SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use this skill when you need RLHF, want to align a model with preferences, or want to train from human feedback. Works with Hugging Face Transformers.

Enabled by default | Built In
CLI install command: `aegis skills install trl-fine-tuning`
Overview

Bundled with the packaged Aegis CLI as a built-in procedural skill.

This skill already ships inside the packaged Aegis bundle. Run `aegis skills install trl-fine-tuning` only when you want an explicit local materialization record.