Overview

Summary: Train models using reinforcement learning in minutes

Original Documentation



Reinforcement Fine-Tuning (RFT) is free for models under 16B parameters. When creating an RFT job in the UI, filter for free tuning models in the model selection area on the fine-tuning creation page. If you launch jobs from the terminal, you can find the model ID in the Model Library. Note: SFT and DPO jobs are billed per training token for all model sizes. See the pricing page for details.

Fireworks RFT uses reinforcement learning to train frontier models like DeepSeek V3 and Kimi K2 to outperform closed models on your product use case. It is built to be powerful and easy to use for developers and enterprises:

  • No infrastructure: Train frontier models without managing GPUs or RL infra
  • Production-ready: Built-in tracing, monitoring, security & one-click deploy
  • Fast iteration: From evaluator setup to deployed model in hours, not weeks

See how Genspark and Vercel used Fireworks RFT to train open models for agentic use cases, outperforming leading closed models.

Quickstart: Pick Your Training Approach#

⏱️ 15 minutes

Best for: Testing locally, simple task training

How it works: Iterate on your evaluator and use it to train a small model on Fireworks.
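To make "iterate on your evaluator" concrete, here is a minimal sketch of what a locally testable evaluator might look like. It is plain Python and does not use the Fireworks SDK. The function names, the `<answer>` tag convention, and the score values are all assumptions chosen for illustration, not part of the Fireworks API.

```python
import re
from typing import Optional


def extract_answer(response: str) -> Optional[str]:
    """Return the text inside the last <answer>...</answer> tag, or None.

    The <answer> tag format is a hypothetical convention for this sketch.
    """
    matches = re.findall(r"<answer>(.*?)</answer>", response, flags=re.DOTALL)
    return matches[-1].strip() if matches else None


def evaluate(response: str, expected: str) -> float:
    """Score a model response between 0.0 and 1.0 (hypothetical scoring rule).

    - 0.0: no parseable answer
    - 0.1: well-formed answer that is wrong (small credit for format)
    - 1.0: answer matches the expected value exactly
    """
    answer = extract_answer(response)
    if answer is None:
        return 0.0
    if answer == expected.strip():
        return 1.0
    return 0.1


if __name__ == "__main__":
    # Quick local check on a few hand-written samples before any training run.
    samples = [
        ("Let me think. <answer>42</answer>", "42"),
        ("The result is 42", "42"),
        ("<answer>41</answer>", "42"),
    ]
    for response, expected in samples:
        print(evaluate(response, expected), "<-", repr(response))
```

Running a handful of hand-written samples through a function like this is a cheap way to check that the scores rank responses the way you intend before spending training time.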

⏱️ 1-2 hours

Best for: Agents, multi-turn workflows, existing services

How it works: Rollouts happen in your environment. Connect via HTTP with tracing.

⏱️ 2-4 hours

Best for: Sensitive data, compliance, enterprise

How it works: Training data never leaves your GCS/S3 bucket. Full data isolation.

Launch Training#

Requirements, validation checks, and common errors before launching

Fast, scriptable, reproducible. Perfect for automation and iteration

Visual, guided, beginner-friendly. Great for exploring options

Already familiar with firectl? You can create RFT jobs directly.

RFT Concepts#

The RL training loop explained

How reward functions guide training

Local vs remote evaluation environments

Optimize your training configuration

Estimate and optimize your training costs

Link last verified June 7, 2026.
Source: Fireworks AI Docs