W&B Training

Documentation Index

Fetch the complete documentation index at https://docs.wandb.ai/llms.txt. Use this file to discover all available pages before exploring further.
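As a rough illustration of using the index, the sketch below downloads the file and lists the pages it links to. It assumes the index uses Markdown-style links of the form `[Title](URL)`, which is a common convention for llms.txt files but is not confirmed by this page. The helper names `parse_index` and `fetch_index` are hypothetical and chosen only for this example.

```python
import re
from urllib.request import urlopen

# URL taken from the text above.
INDEX_URL = "https://docs.wandb.ai/llms.txt"

# Assumption: each entry is a Markdown link such as
# "- [Page title](https://...): optional description".
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def parse_index(text):
    """Return (title, url) pairs for every Markdown link found in the text."""
    return [(m.group(1), m.group(2)) for m in LINK_RE.finditer(text)]


def fetch_index(url=INDEX_URL, timeout=10):
    """Download the index file and return its text (needs network access)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `parse_index(fetch_index())` would return the list of pages to explore. Keeping the parsing separate from the download means the parser can be checked offline against a saved copy of the file.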

Post-train your models using reinforcement learning and supervised fine-tuning

Now in public preview, W&B Training offers serverless post-training for large language models (LLMs), including both reinforcement learning (RL) and supervised fine-tuning (SFT).

  • Serverless RL: Improve model reliability on multi-turn, agentic tasks while increasing speed and reducing costs. RL is a training technique in which models learn to improve their behavior through feedback on their outputs.
  • Serverless SFT: Fine-tune models using curated datasets for distillation, teaching output style and format, or warming up before RL.
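To make "feedback on outputs" concrete, here is a toy sketch of the idea behind RL. It is not how W&B Training is implemented. It assumes a "model" that is just a softmax over a few candidate outputs, a fixed score per output standing in for feedback, and a basic policy-gradient (REINFORCE) update. The function `train_toy_policy` and its parameters are hypothetical.

```python
import math
import random


def softmax(logits):
    """Convert raw preference scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_toy_policy(scores, steps=2000, lr=0.1, seed=0):
    """Sample an output, observe its score, and shift probability toward
    outputs that score above a running-average baseline (REINFORCE)."""
    rng = random.Random(seed)
    logits = [0.0] * len(scores)  # start with no preference
    baseline = 0.0                # running average of observed scores
    for _ in range(steps):
        probs = softmax(logits)
        choice = rng.choices(range(len(scores)), weights=probs)[0]
        reward = scores[choice]   # the feedback on this output
        advantage = reward - baseline
        for i in range(len(logits)):
            # Gradient of log-probability of the chosen output w.r.t. logit i.
            grad = (1.0 if i == choice else 0.0) - probs[i]
            logits[i] += lr * advantage * grad
        baseline += 0.1 * (reward - baseline)
    return softmax(logits)


# Two candidate outputs: the second one gets the better feedback.
final_probs = train_toy_policy([0.0, 1.0])
```

After training, `final_probs` puts most of its weight on the higher-scoring output. Real post-training applies the same principle to an LLM's token probabilities, with rewards computed from full multi-turn trajectories.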

W&B Training includes integration with:

To get started, complete the prerequisites for using the service, then see the Serverless RL quickstart or the Serverless SFT docs to learn how to post-train your models.

Link last verified June 7, 2026.
Source: Weights & Biases Docs
Link last verified: 2026-04-05