Last updated: 06/08/2025 (API docstrings are auto-generated).
Trainers drive the training loop. You are encouraged to introduce a new trainer class when a new training paradigm calls for one.
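As a rough illustration of what "driving the training loop" can mean, the sketch below shows a base class that owns the loop and a subclass that supplies the paradigm-specific step. All names here (`BaseTrainer`, `MeanLossTrainer`, `train_step`, `fit`, `total_steps`) are hypothetical and chosen for this example only; they are not taken from the verl API, and the "loss" is a stand-in computation.

```python
from abc import ABC, abstractmethod
from typing import Iterable, List


class BaseTrainer(ABC):
    """Hypothetical base class: owns the loop, delegates each step."""

    def __init__(self, total_steps: int) -> None:
        self.total_steps = total_steps
        self.history: List[float] = []

    @abstractmethod
    def train_step(self, batch: List[float]) -> float:
        """Run one paradigm-specific update and return a scalar metric."""

    def fit(self, batches: Iterable[List[float]]) -> List[float]:
        # The trainer drives the loop: it pulls batches, calls the step,
        # records the metric, and stops after total_steps iterations.
        for step, batch in enumerate(batches):
            if step >= self.total_steps:
                break
            self.history.append(self.train_step(batch))
        return self.history


class MeanLossTrainer(BaseTrainer):
    """Hypothetical new paradigm: the 'loss' is just the batch mean."""

    def train_step(self, batch: List[float]) -> float:
        return sum(batch) / len(batch)


if __name__ == "__main__":
    trainer = MeanLossTrainer(total_steps=2)
    print(trainer.fit([[1.0, 3.0], [2.0, 2.0], [10.0]]))
```

Under this layout, adding a new training paradigm means writing a new subclass with its own `train_step`, while the loop in `fit` stays unchanged.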