Serverless Serving Paths

Summary: Standard, Priority, and Fast serving paths on Fireworks Serverless

Fireworks Serverless offers three serving paths:

  • Standard is the default serving path. No service_tier parameter is needed.
  • Priority tier is for workloads that require higher reliability during peak traffic.
  • Fast is for workloads that require higher generation speed.

Priority tier#

Priority tier is for workloads that require higher reliability during peak traffic periods, at a higher price point. Priority traffic is prioritized above Standard traffic and is less likely to be load shed (rejected with a 503 server overloaded error).

To use priority tier, set service_tier to "priority". The parameter is supported on the OpenAI-compatible chat completions API and on the Anthropic-compatible messages API:

curl https://api.fireworks.ai/inference/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $FIREWORKS_API_KEY" \
  -d '{
    "model": "accounts/fireworks/models/kimi-k2p5",
    "service_tier": "priority",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
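
For callers working from Python rather than curl, the same request can be sketched with the standard library alone. This is a minimal illustration, not an official client: the helper name build_chat_request and its api_key argument are hypothetical, and the function only builds the request so it can be inspected before anything is sent. Omitting service_tier leaves the request on the Standard path.

```python
import json
import os
import urllib.request

API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"


def build_chat_request(model, messages, service_tier=None, api_key=None):
    """Build (but do not send) a chat completions request.

    Hypothetical helper: service_tier is only added to the body when given,
    so the default stays on the Standard serving path.
    """
    payload = {"model": model, "messages": messages}
    if service_tier is not None:
        payload["service_tier"] = service_tier
    # Fall back to the same environment variable the curl example uses.
    key = api_key if api_key is not None else os.environ.get("FIREWORKS_API_KEY", "")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {key}",
        },
        method="POST",
    )


def send(request):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling send(build_chat_request("accounts/fireworks/models/kimi-k2p5", [{"role": "user", "content": "Hello"}], service_tier="priority")) would issue the same call as the curl command above.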

Priority tier is available on select models. Models and pricing are listed on the Serverless pricing page.
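
One way an application might combine the two paths is to send on Standard and retry once on priority tier only when the request is load shed. The sketch below is an assumption about client-side policy, not a documented Fireworks recommendation. The send argument is a hypothetical transport callable that returns a (status_code, body) pair, and the retry only makes sense for models where priority tier is available.

```python
def send_with_priority_fallback(send, payload):
    """Try the payload as given; on a 503, retry once with priority tier.

    send is a hypothetical callable: send(payload) -> (status_code, body).
    """
    status, body = send(payload)
    if status == 503 and payload.get("service_tier") != "priority":
        # Copy the payload so the caller's dict is left unchanged.
        retry_payload = {**payload, "service_tier": "priority"}
        status, body = send(retry_payload)
    return status, body
```

Because the retry is billed at the priority price point, a real application would likely want to log or cap how often the fallback fires.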

Fast#

Fast is a high-speed serving path for interactive applications that need fast responses, at a higher price point. Fast variants aim for 100+ tokens per second of generated throughput. A Fast variant is not a different model, and model quality remains the same.
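
To check observed speed against the 100+ tokens per second figure, the arithmetic is simply generated tokens divided by generation time. The helper names below are hypothetical. The sketch assumes the token count comes from the response (for example a completion_tokens value in an OpenAI-style usage object) and that elapsed_seconds covers generation only; a wall-clock time that includes network latency and prompt processing will understate throughput.

```python
FAST_TARGET_TPS = 100  # the "100+ tokens per second" aim quoted above


def tokens_per_second(completion_tokens, elapsed_seconds):
    """Return generated tokens per second for one response."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completion_tokens / elapsed_seconds


def meets_fast_target(completion_tokens, elapsed_seconds):
    """True when measured throughput reaches the Fast target."""
    return tokens_per_second(completion_tokens, elapsed_seconds) >= FAST_TARGET_TPS
```

For example, 500 generated tokens in 4 seconds works out to 125 tokens per second, which is above the target.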

Fast is available for select models. To use Fast, change the model ID as listed below.

Model            Model ID
Kimi K2.6 Fast   accounts/fireworks/routers/kimi-k2p6-fast
GLM 5.1 Fast     accounts/fireworks/routers/glm-5p1-fast

curl https://api.fireworks.ai/inference/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $FIREWORKS_API_KEY" \
  -d '{
    "model": "accounts/fireworks/routers/kimi-k2p6-fast",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
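
Since Fast is selected through the model ID rather than a request parameter, a small lookup can keep the IDs in one place. This is a sketch only: FAST_MODEL_IDS and fast_payload are hypothetical names, and the mapping contains just the two entries listed in the table above, which may change as models are added or retired.

```python
# Model IDs copied from the table above; keys are the display names.
FAST_MODEL_IDS = {
    "Kimi K2.6 Fast": "accounts/fireworks/routers/kimi-k2p6-fast",
    "GLM 5.1 Fast": "accounts/fireworks/routers/glm-5p1-fast",
}


def fast_payload(model_name, messages):
    """Build a chat completions body that targets a Fast variant.

    No service_tier field is set: Fast is chosen by the model ID alone.
    """
    if model_name not in FAST_MODEL_IDS:
        raise ValueError(f"no Fast variant listed for {model_name!r}")
    return {"model": FAST_MODEL_IDS[model_name], "messages": messages}
```

The resulting dictionary can be serialized to JSON and posted to the same chat completions endpoint used in the curl example.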

Pricing is listed on the Serverless pricing page.

Source: Fireworks AI Docs
Link last verified: 2026-06-07