Fireworks Agent Overview ↗

fireworks concept beginner agents fine-tuning cost-management models deployment

Summary: Describe what you want, approve the plan and cost, get a deployed fine-tuned model.

Original Documentation

Documentation Index#
Fetch the complete documentation index at: https://docs.fireworks.ai/llms.txt Use this file to discover all available pages before exploring further.

Describe what you want, approve the plan and cost, get a deployed fine-tuned model.

Fireworks Agent is a hosted Fireworks assistant that owns the full fine-tuning loop. You describe what you want — “fine-tune a model that classifies our support tickets”, “improve Llama 3.1 70B on our function-calling data”, “train a smaller model that matches GPT-4 on our routing task” — and Agent picks the base model, prepares the dataset, runs a hyperparameter sweep, submits training, evaluates the result, and deploys the fine-tuned model. You stay in the loop for approvals and final calls; everything else is handled.

Agent is the easiest of the three Fireworks fine-tuning paths, sitting alongside Managed Fine-Tuning and the Training API. It’s the right starting point when you want a working fine-tuned model without writing config files or Python training loops.

Naming. This documentation refers to the product as Fireworks Agent (or just Agent). You may also see it called pilot in internal source code, in CLI permission presets (--permission-preset=pilot), in the embedded manifest file (pilot.yaml), and in some legacy support contexts — those are all the same product. Use “Fireworks Agent” or “Agent” in your own prompts and communication.

What Agent does for you#

<span class=“card-start” data-card-raw=“title=“Picks the right model” icon=“sparkles”"> Agent recommends a base model and tuning method (SFT, DPO, or classification) from your task description and a peek at your data.

<span class=“card-start” data-card-raw=“title=“Plans the run” icon=“list-check”"> Inspects your dataset, proposes hyperparameters, estimates cost, and presents a single plan for approval before any spend.

<span class=“card-start” data-card-raw=“title=“Runs and evaluates” icon=“play”"> Submits the job, streams progress, evaluates checkpoints, and ships a deployed model at the end.

Concretely, Agent can:

Run SFT, DPO, and classification jobs from a natural-language prompt
Inspect your dataset and call out format issues before training starts
Recommend a base model from a curated panel based on your task shape
Run a short hyperparameter sweep before committing to full training
Stream a live progress feed with eval loss, cost-so-far, and ETA
Evaluate the trained model against a held-out set and surface the best checkpoint
Deploy the fine-tuned model so you can call it from chat/completions immediately
Author task-specific evaluators for use in SFT sweeps, or Eval Protocol evaluators you can then run through Managed Fine-Tuning’s RFT path
Answer questions about your account, deployments, jobs, and Fireworks models along the way

Agent does not run RFT training itself — for that, author the evaluator with Agent and then submit the RFT job through Managed Fine-Tuning. Agent also cannot run an arbitrary Python training loop, use a custom loss function, or sample mid-training from your own evaluator — for those, use the Training API directly.

Architecture#

flowchart LR
    Client["Client<br/>(user via web app,<br/>user via firectl / REST API,<br/>or coding agent)"] -->|"create session"| AgentAPI["Fireworks Agent API"]
    AgentAPI -->|dispatch| Runner["Session Runner"]
    Runner -->|"plan + cost estimate"| AgentAPI
    AgentAPI -->|"events stream"| Client
    Client -->|"approve / answer"| AgentAPI
    AgentAPI -->|"session update"| Runner
    Runner -->|"firectl + Fireworks API"| Platform["Fireworks Platform"]
    Platform -->|results| Runner
    Runner -->|"final report + deployed model"| Client

The runner is an ephemeral, sandboxed environment with its own filesystem. It executes Agent’s plan against your Fireworks account using your API key. Sessions can pause for hours or days waiting on user input without consuming compute.

Two ways to use Agent#

The default — and recommended — surface for most users. Open Agent in the left nav of app.fireworks.ai for a chat interface that streams Agent’s plan, progress, and final report. Best for:

Most fine-tuning workflows, end to end
Teams that want a visual plan, cost, and approval UX
Watching a long training run with a live progress feed
Skipping firectl installation and service-account setup

Dashboard quickstart#

Click Agent in the left navigation at app.fireworks.ai.

A good first prompt is specific about what you’re training for, what data to use, and what success looks like:

```text
    Fine-tune a model on accounts/your-account/datasets/support-tickets.
    Classify each ticket into one of 12 categories.
    Target: better than GPT-4 mini on accuracy. Budget: under $5.
    ```

Agent will inspect the dataset, propose a plan, and stop for your approval.

Agent presents one structured plan with a cost estimate. Approve, request a change (“use Qwen3 32B instead”, “skip HP tuning”), or cancel. No spend happens before this gate.

Agent streams phase-anchored updates every few minutes through the final report, which includes the deployed model ID and inference endpoint.

The advanced path, for power users and anyone already living in a coding-agent harness. Use it two ways:

Drive Agent directly from firectl session — script it, run it from CI, or call the REST API.
Let Claude Code, Cursor, Codex, Aider, Goose, or another coding agent drive it for you by installing the Fireworks Agent skill file. The coding agent shells out to firectl session using a scoped service-account key.

Best for:

Fine-tuning as a step in a larger coding workflow
Reproducing a training run with code-checked-in instructions
Power users who already orchestrate everything from their coding agent or terminal
Scripting and automation against the firectl session / REST API

CLI quickstart#

Create a service account scoped to Agent’s capabilities (the pilot permission preset — see the security section below for the rationale) and mint an API key:

```bash
    firectl -a <account> user create \
      --service-account \
      --user-id=fireworks-agent \
      --permission-preset=pilot

    firectl -a <account> api-key create --service-account=fireworks-agent
    ```

Save the returned key in a `.env` file in your project root:

```bash
    FIREWORKS_AGENT_API_KEY=fw-...
    ```

The Fireworks Agent skill sources `.env` automatically. See [Service Accounts](/accounts/service-accounts) for the full setup.

bash source .env && firectl session create \ --api-key $FIREWORKS_AGENT_API_KEY \ --instruction "Run SFT on Qwen3 32B using accounts/myacct/datasets/mydata"

The command returns a session ID, for example `abc123`.

bash source .env && firectl session events abc123 --api-key $FIREWORKS_AGENT_API_KEY --wait

The `--wait` flag keeps streaming until the session reaches `waiting`, `succeeded`, `failed`, or `cancelled`. Without it, the command dumps existing events and exits.

When the stream stops at waiting, read Agent’s question, then send your answer back to the same session:

```bash
    source .env && firectl session update abc123 \
      --api-key $FIREWORKS_AGENT_API_KEY \
      --instruction "Approved, proceed."
    ```

Re-run `firectl session events abc123 --wait` to resume. Repeat until the session reports `succeeded`.

How Agent runs a training job#

Every Agent session moves through the same seven phases. Coding agents should expect this sequence; humans can use it as a mental model for what to expect next.

#	Phase	What happens
1	Data inspection	Agent reads your dataset, reports format, sample count, token count, and any issues.
2	Planning & approval	Agent proposes base model, tuning method, hyperparameters, eval path, and a cost estimate. You approve, edit, or cancel.
3	HP tuning	A short parallel sweep (typically 3 configs) over LoRA rank and learning rate, capped at 6 active jobs by default.
4	Full training	The best config from phase 3 runs to completion on the full dataset, with per-epoch eval loss.
5	Evaluation	The trained model is evaluated against a held-out set using one of three strategies you pick in phase 2: validation loss only (default), an evaluator you provide, or an evaluator Agent generates for you.
6	Deployment	The model is deployed and a `fireworks-ai` SDK snippet is ready for inference.
7	Final report	Deployed model ID, key metrics, total cost, and per-phase summary in one message.

DPO uses the same shape with phase 3 replaced by a preference sweep (or pair generation followed by a preference sweep when the dataset is prompts-only). Classification uses the same shape with phase 3 expanded into a base-model benchmark plus a fine-tuning sweep, and phase 5 reports per-label and overall accuracy. The promotion gate between phase 3 and phase 4 is one of the two user-facing pauses (the other is plan approval in phase 2).

The approval and cost contract#

Agent never spends without an explicit approval. This is structural, not a setting.

At the end of Phase 2 (Planning) — and again before any new spend-incurring step — Agent surfaces a structured cost preview and waits for approval. In the dashboard this is a yes/no prompt. From a coding agent, the skill holds the session in a waiting state, surfaces Agent’s exact question, and only proceeds after you respond via firectl session update. Reject and the session ends with no charges.

The preview always includes:

Total estimated cost (in USD, with a confidence range)
Estimated wall time
Per-phase cost breakdown (HP tuning / full training / evaluation / deployment)
Cost-so-far in the session (for re-approvals on long runs)

Out-of-coverage behavior#

If you ask Agent to use a model or method outside its supported set, it refuses rather than silently approximating. For example, asking for full-parameter tuning on a model with no Agent recipe returns a clear “not supported in Agent — use Managed Fine-Tuning or the Training API” message with a pointer to the right surface. See When not to use Agent.

What Agent can do today#

End-to-end SFT with dataset inspection, hyperparameter sweep, evaluator-guided model selection, and a deployed winner.

Run DPO or ORPO on pre-paired preferences or generate pairs automatically with delta learning, with an optional base-model sweep.

Benchmark base models, fine-tune on labeled data, and compare base vs fine-tuned classification accuracy on a held-out split.

Generate a reusable Python evaluator Agent uses to score candidates during an SFT sweep, or an Eval Protocol evaluator you can take to a Managed RFT job — directly from your dataset.

Copy-paste skill files for Claude Code, Cursor, Codex, Aider, and Goose so they can drive Agent for you.

Agent vs Managed Fine-Tuning vs Training API#

All three sit on the same training infrastructure, GPU shapes, and tuning methods. The difference is how much you drive.

	Fireworks Agent	Managed Fine-Tuning	Training API
Interface	Natural language (dashboard chat, `firectl session`, or via coding agent)	UI, `firectl`, REST	Python script
Who picks the model	Agent recommends	You	You
Who tunes hyperparameters	Agent runs a sweep	You set them	You set them
Cost approval	Built-in gate	None — you submit jobs directly	None
Custom loss / training loop	Not supported	Not supported	Supported
Inference-in-the-loop eval	Not supported	Not supported	Supported (hotload)
Best for	Getting a working fine-tuned model fast, without ML expertise	Production runs with known config	Research, custom RL, hybrid losses

When not to use Agent#

Reach for a more direct surface when:

You need a custom loss function or hybrid objective → Training API
You need to hotload checkpoints for mid-training inference evaluation → Training API
You already know your config and just want to submit a job → Managed Fine-Tuning
You need full-parameter tuning on a model Agent doesn’t cover → Managed Fine-Tuning
You’re training in a fully automated CI pipeline with no human approval → Agent’s approval gate is interactive by design; Managed Fine-Tuning is the better fit today

Security: service accounts and the Agent manifest#

When a coding agent drives Fireworks Agent on your behalf, it should authenticate as a service account with the pilot permission preset, not your personal user key. This enforces a layered permissions model:

Effective permissions = User role ∩ Agent capability manifest

The manifest is a real artifact#

The Agent capability manifest is a versioned YAML file (pilot.yaml, kept under its original internal name) embedded into the Fireworks control-plane binary at build time. It enumerates the exact set of RPC methods the pilot preset is allowed to call — roughly 80 methods grouped by capability surface:

Account & billing — GetAccountUsage, GetQuota, ListQuotas, ListCosts
Models — GetModel, ListModels, CreateModelVersion, PrepareModel, ValidateModelUpload
Deployments — GetDeployment, CreateDeployment, DeployModelVersion, GetDeploymentMetrics
Datasets — CreateDataset, GetDataset, ListDatasets, PreviewDataset, SplitDataset
Evaluators and evaluations — CreateEvaluator, GetEvaluator, CreateEvaluation, TestEvaluation
Fine-tuning jobs — CreateSupervisedFineTuningJob, CreateDpoJob, CreateReinforcementFineTuningJob, CreateRlorTrainerJob ^{(the RFT and RLOR-trainer RPCs are granted by the manifest but Agent’s current workflows don’t use them — see What Agent does for you)}
Training shapes — GetTrainingShape, ListTrainingShapes
Batch inference and inference logs — CreateBatchInferenceJob, ListInferenceLogs

The control plane enforces the manifest as a hard ceiling before checking the underlying user’s role: even if the user has broader permissions, the preset cannot exceed what the manifest allows. Any RPC outside the manifest returns PERMISSION_DENIED at the API gateway, regardless of how the request was constructed.

Non-destructive guarantee, structurally enforced#

Agent’s promise to never delete, cancel, or destroy your existing resources is enforced by the manifest itself, not by skill-level politeness. The manifest does not include any Delete*, Cancel*, or destructive RPC methods. Even a malicious or hallucinated tool call targeting DeleteModel, CancelReinforcementFineTuningJob, or DeleteDeployment is rejected at the control plane before it reaches the resource layer.

Cross-account reads, never cross-account writes#

The pilot preset is granted read-only access across accounts. This is what lets Agent reach Fireworks-owned public resources — base models at accounts/fireworks/models/..., public deployment shapes, public datasets — using only your account’s API key. Agent cannot write into any other account; mutating operations are scoped to your account.

Auto-update on control-plane releases#

Because the manifest is compiled into the control-plane binary, expanded Agent capabilities ship automatically with every control-plane deploy. Your service account stores only the preset name (pilot), not the list of allowed methods — so new capabilities are picked up without rotating keys or re-provisioning the service account. See Service Accounts for setup details.

Session lifecycle reference#

Command	What it does	Confirmation required
`firectl session create --instruction "<instruction>"`	Start a new session	No
`firectl session events <id> --wait`	Stream events until terminal or waiting state	No
`firectl session get <id>`	Get current status and details	No
`firectl session list`	List sessions for your account	No
`firectl session update <id> --instruction "<answer>"`	Send a response to a waiting session	Yes — confirm with the user
`firectl session cancel <id>`	Stop a running session (keeps the record)	Yes — confirm with the user
`firectl session delete <id>`	Remove the session record (irreversible)	Yes — confirm with the user

All commands accept --api-key $FIREWORKS_AGENT_API_KEY for non-interactive auth and --scope optimize (the default scope).

Troubleshooting#

Agent shares the on-demand pool with the Training API. If GPU capacity is tight, jobs queue. If you need guaranteed capacity, [request a reservation](https://fireworks.ai/contact). Agent only runs methods it has curated recipes for. For anything outside that set, use [Managed Fine-Tuning](/fine-tuning/managed-finetuning-intro) or the [Training API](/fine-tuning/training-api/introduction). You're missing the `--wait` flag. Without it, `firectl session events` prints existing events and returns. The Fireworks Agent skill always passes `--wait`, which keeps the stream open until the session reaches `waiting`, `succeeded`, `failed`, or `cancelled`. If you're driving `firectl` directly, add `-w / --wait`. Agent's preview includes HP tuning, full training, evaluation, and the first hour of deployment. Reject the plan and ask Agent to skip HP tuning or use a smaller base model — the next preview will reflect the lower scope.

Next steps#

Open Agent in the left nav at app.fireworks.ai.

Install the skill file in Claude Code, Cursor, Codex, Aider, or Goose.

Drive the same training infra directly when you know your config.

Write your own Python training loop on Fireworks GPUs.

Agent crib notes

Auth: set FIREWORKS_AGENT_API_KEY in a project-local .env (the key is from a service account with the pilot permission preset). Source it via source .env && ... and pass on every command as --api-key $FIREWORKS_AGENT_API_KEY.
Use the same session ID for follow-ups. Never create a new session to continue an existing conversation.
Always pass --wait to session events, or the command exits immediately after dumping history.
create, get, events, and list are safe to run without user confirmation. Always confirm with the user before update, cancel, or delete.
On waiting, surface Agent’s exact question to the user verbatim; do not paraphrase.
See Use with coding agents for a complete copy-paste skill for Claude Code, Cursor, Codex, Aider, and Goose.

Link last verified June 7, 2026. View original ↗

Source: Fireworks AI Docs

Link last verified: 2026-06-07