Serverless Quickstart

Summary: Make your first Serverless API call in minutes

Editorial Notes

This quickstart is the fastest path to your first Fireworks inference call across several SDKs. It matters because Fireworks exposes an OpenAI-compatible API, so most existing OpenAI client code works after swapping the base URL and API key. Set the FIREWORKS_API_KEY environment variable up front, since every example depends on it; forgetting it is a trivial but common first stumble. The OpenAI compatibility is the real draw: porting from OpenAI or Together AI is mostly configuration rather than rewriting. Read the fine-tuning intro next if you plan to customize a model rather than just call one.


Original Documentation

Make your first Serverless API call in minutes

Serverless is the fastest way to get started with open models. This quickstart walks you through your first API call in minutes.

## Step 1: Create and export an API key

Before you begin, create an API key in the Fireworks dashboard. Click Create API key and store it in a safe location.

Once you have your API key, export it as an environment variable in your terminal:

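<span class="tab-group-start"></span>
  <span class="tab-start" data-tab-title="macOS/Linux"></span>
```bash
    export FIREWORKS_API_KEY="your_api_key_here"
    ```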
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="Windows"></span>
```powershell
    # setx saves the variable for future sessions; open a new terminal before running the examples.
    setx FIREWORKS_API_KEY "your_api_key_here"
    ```
  <span class="tab-end"></span>
<span class="tab-group-end"></span>
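
Because every example below reads `FIREWORKS_API_KEY` from the environment, it can help to fail fast with a readable message when the variable is missing. The following is a minimal sketch; `require_api_key` is a hypothetical helper name and is not part of any Fireworks SDK.

```python
import os


def require_api_key(env=None, name="FIREWORKS_API_KEY"):
    """Return the API key from the environment, or raise a readable error.

    `require_api_key` is a hypothetical helper used for illustration only.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Export it in your shell before running the examples."
        )
    return key
```

You could then pass `api_key=require_api_key()` when constructing a client instead of `os.environ.get("FIREWORKS_API_KEY")`, which silently returns `None` when the variable is unset.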

## Step 2: Make your first Serverless API call

<span class="tab-group-start"></span>
  <span class="tab-start" data-tab-title="Python (Fireworks SDK)"></span>
Install the [Fireworks Python SDK](/tools-sdks/python-sdk):

<span class="callout-start" data-callout-type="note"></span>
  The SDK is currently in alpha. Use the `--pre` flag when installing to get the latest version.
<span class="callout-end"></span>

Run the command that matches your package manager (pip, Poetry, or uv):


  ```bash
      pip install --pre fireworks-ai
      ```

  ```bash
      poetry add --pre fireworks-ai
      ```

  ```bash
      uv add --pre fireworks-ai
      ```


Then make your first Serverless API call:

```python
    from fireworks import Fireworks

    client = Fireworks()

    response = client.chat.completions.create(
      model="accounts/fireworks/models/deepseek-v3p1",
      messages=[{
        "role": "user",
        "content": "Say hello in Spanish",
      }],
    )

    print(response.choices[0].message.content)
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="Python (OpenAI SDK)"></span>
Fireworks provides an OpenAI-compatible endpoint. Install the [OpenAI Python SDK](https://github.com/openai/openai-python):

```bash
    pip install openai
    ```

Then make your first Serverless API call:

```python
    import os
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ.get("FIREWORKS_API_KEY"),
        base_url="https://api.fireworks.ai/inference/v1"
    )

    response = client.chat.completions.create(
        model="accounts/fireworks/models/deepseek-v3p1",
        messages=[{
            "role": "user",
            "content": "Say hello in Spanish",
        }],
    )

    print(response.choices[0].message.content)
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="Python (Anthropic SDK)"></span>
Fireworks provides an Anthropic-compatible endpoint. Install the [Anthropic Python SDK](https://github.com/anthropics/anthropic-sdk-python):

```bash
    pip install anthropic
    ```

Then make your first Serverless API call:

```python
    import os
    import anthropic

    client = anthropic.Anthropic(
        api_key=os.environ.get("FIREWORKS_API_KEY"),
        base_url="https://api.fireworks.ai/inference"
    )

    response = client.messages.create(
        model="accounts/fireworks/models/deepseek-v3p1",
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": "Say hello in Spanish",
        }],
    )

    print(response.content[0].text)
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="JavaScript (OpenAI SDK)"></span>
Fireworks provides an OpenAI-compatible endpoint. Install the [OpenAI JavaScript / TypeScript SDK](https://github.com/openai/openai-node):

```bash
    npm install openai
    ```

Then make your first Serverless API call:

```javascript
    import OpenAI from "openai";

    const client = new OpenAI({
      apiKey: process.env.FIREWORKS_API_KEY,
      baseURL: "https://api.fireworks.ai/inference/v1",
    });

    const response = await client.chat.completions.create({
      model: "accounts/fireworks/models/deepseek-v3p1",
      messages: [
        {
          role: "user",
          content: "Say hello in Spanish",
        },
      ],
    });

    console.log(response.choices[0].message.content);
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="JavaScript (Anthropic SDK)"></span>
Fireworks provides an Anthropic-compatible endpoint. Install the [Anthropic JavaScript / TypeScript SDK](https://github.com/anthropics/anthropic-sdk-typescript):

```bash
    npm install @anthropic-ai/sdk
    ```

Then make your first Serverless API call:

```javascript
    import Anthropic from "@anthropic-ai/sdk";

    const client = new Anthropic({
      apiKey: process.env.FIREWORKS_API_KEY,
      baseURL: "https://api.fireworks.ai/inference",
    });

    const response = await client.messages.create({
      model: "accounts/fireworks/models/deepseek-v3p1",
      max_tokens: 1024,
      messages: [
        {
          role: "user",
          content: "Say hello in Spanish",
        },
      ],
    });

    console.log(response.content[0].text);
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="curl"></span>
```bash
    curl https://api.fireworks.ai/inference/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $FIREWORKS_API_KEY" \
      -d '{
        "model": "accounts/fireworks/models/deepseek-v3p1",
        "messages": [
          {
            "role": "user",
            "content": "Say hello in Spanish"
          }
        ]
      }'
    ```
  <span class="tab-end"></span>
<span class="tab-group-end"></span>

You should see a response like: `"¡Hola!"`

<span class="callout-start" data-callout-type="tip"></span>
  For **Priority tier** (`service_tier: "priority"`) and **Fast**, see [Serverless Serving Paths](/serverless/serving-paths).
<span class="callout-end"></span>
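
If you prefer not to install an SDK, the curl call maps directly onto a plain HTTP request. The sketch below builds (but does not send) that request with only the Python standard library. The URL, headers, and model ID come from the curl example; `build_chat_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Endpoint and model ID are taken from the curl example in this guide.
CHAT_URL = "https://api.fireworks.ai/inference/v1/chat/completions"
MODEL = "accounts/fireworks/models/deepseek-v3p1"


def build_chat_request(api_key, prompt, model=MODEL):
    """Build (but do not send) the chat completions POST request."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

To send it, pass the request to `urllib.request.urlopen` and decode the JSON body; as in the SDK examples, the reply text should be under `choices[0].message.content`.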

## Common use cases

### Streaming responses

Stream responses token by token so users see output as it is generated:

<span class="tab-group-start"></span>
  <span class="tab-start" data-tab-title="Python (Fireworks SDK)"></span>
```python
    from fireworks import Fireworks

    client = Fireworks()

    stream = client.chat.completions.create(
      model="accounts/fireworks/models/deepseek-v3p1",
      messages=[{"role": "user", "content": "Tell me a short story"}],
      stream=True
    )

    for chunk in stream:
      if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="Python (OpenAI SDK)"></span>
```python
    import os
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ.get("FIREWORKS_API_KEY"),
        base_url="https://api.fireworks.ai/inference/v1"
    )

    stream = client.chat.completions.create(
        model="accounts/fireworks/models/deepseek-v3p1",
        messages=[{"role": "user", "content": "Tell me a short story"}],
        stream=True
    )

    for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="Python (Anthropic SDK)"></span>
```python
    import os
    import anthropic

    client = anthropic.Anthropic(
        api_key=os.environ.get("FIREWORKS_API_KEY"),
        base_url="https://api.fireworks.ai/inference"
    )

    with client.messages.stream(
        model="accounts/fireworks/models/deepseek-v3p1",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Tell me a short story"}],
    ) as stream:
        for text in stream.text_stream:
            print(text, end="", flush=True)
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="JavaScript (OpenAI SDK)"></span>
```javascript
    import OpenAI from "openai";

    const client = new OpenAI({
      apiKey: process.env.FIREWORKS_API_KEY,
      baseURL: "https://api.fireworks.ai/inference/v1",
    });

    const stream = await client.chat.completions.create({
      model: "accounts/fireworks/models/deepseek-v3p1",
      messages: [{ role: "user", content: "Tell me a short story" }],
      stream: true,
    });

    for await (const chunk of stream) {
      process.stdout.write(chunk.choices[0]?.delta?.content || "");
    }
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="JavaScript (Anthropic SDK)"></span>
```javascript
    import Anthropic from "@anthropic-ai/sdk";

    const client = new Anthropic({
      apiKey: process.env.FIREWORKS_API_KEY,
      baseURL: "https://api.fireworks.ai/inference",
    });

    const stream = client.messages.stream({
      model: "accounts/fireworks/models/deepseek-v3p1",
      max_tokens: 1024,
      messages: [{ role: "user", content: "Tell me a short story" }],
    });

    stream.on("text", (text) => {
      process.stdout.write(text);
    });

    await stream.finalMessage();
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="curl"></span>
```bash
    curl https://api.fireworks.ai/inference/v1/chat/completions \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer $FIREWORKS_API_KEY" \
        -d '{
        "model": "accounts/fireworks/models/deepseek-v3p1",
        "messages": [
            {
            "role": "user",
            "content": "Tell me a short story"
            }
        ],
        "stream": true
        }'
    ```
  <span class="tab-end"></span>
<span class="tab-group-end"></span>
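
The streaming loops above print each delta as it arrives. When you also need the full text afterwards, you can accumulate the deltas. This is a minimal sketch assuming the OpenAI-style chunk shape used in the examples (`chunk.choices[0].delta.content`); `collect_stream` and `fake_chunk` are hypothetical names, and the fake chunks stand in for a real SDK stream so the snippet runs offline.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Concatenate the text deltas from an OpenAI-style chunk iterator."""
    parts = []
    for chunk in chunks:
        # Defensive: skip chunks that carry no choices or an empty delta.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
    return "".join(parts)


def fake_chunk(text):
    """Stand-in for an SDK chunk object, used only for this demo."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [fake_chunk("Once "), fake_chunk(None), fake_chunk("upon a time")]
print(collect_stream(demo))  # prints "Once upon a time"
```

With a real client, you would pass the `stream` object returned by `client.chat.completions.create(..., stream=True)` in place of `demo`.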

### Function calling

Connect your models to external tools and APIs:

<span class="tab-group-start"></span>
  <span class="tab-start" data-tab-title="Python (Fireworks SDK)"></span>
```python
    from fireworks import Fireworks

    client = Fireworks()

    response = client.chat.completions.create(
        model="accounts/fireworks/models/kimi-k2-instruct-0905",
        messages=[
            {"role": "user", "content": "What's the weather in Paris?"}
        ],
        tools=[
            {
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Get the current weather for a location",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "location": {
                                "type": "string",
                                "description": "City name, e.g. San Francisco",
                            }
                        },
                        "required": ["location"],
                    },
                },
            },
        ],
    )

    print(response.choices[0].message.tool_calls)
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="Python (OpenAI SDK)"></span>
```python
    import os
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ.get("FIREWORKS_API_KEY"),
        base_url="https://api.fireworks.ai/inference/v1",
    )

    response = client.chat.completions.create(
        model="accounts/fireworks/models/kimi-k2-instruct-0905",
        messages=[
            {"role": "user", "content": "What's the weather in Paris?"}
        ],
        tools=[
            {
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Get the current weather for a location",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "location": {
                                "type": "string",
                                "description": "City name, e.g. San Francisco",
                            }
                        },
                        "required": ["location"],
                    },
                },
            },
        ],
    )

    print(response.choices[0].message.tool_calls)
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="Python (Anthropic SDK)"></span>
```python
    import os
    import anthropic

    client = anthropic.Anthropic(
        api_key=os.environ.get("FIREWORKS_API_KEY"),
        base_url="https://api.fireworks.ai/inference"
    )

    response = client.messages.create(
        model="accounts/fireworks/models/kimi-k2-instruct-0905",
        max_tokens=1024,
        messages=[
            {"role": "user", "content": "What's the weather in Paris?"}
        ],
        tools=[
            {
                "name": "get_weather",
                "description": "Get the current weather for a location",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "City name, e.g. San Francisco",
                        }
                    },
                    "required": ["location"],
                },
            },
        ],
    )

    for block in response.content:
        if block.type == "tool_use":
            print(f"Tool: {block.name}, Input: {block.input}")
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="JavaScript (OpenAI SDK)"></span>
```javascript
    import OpenAI from "openai";

    const client = new OpenAI({
      apiKey: process.env.FIREWORKS_API_KEY,
      baseURL: "https://api.fireworks.ai/inference/v1",
    });

    const tools = [
      {
        type: "function",
        function: {
          name: "get_weather",
          description: "Get the current weather for a location",
          parameters: {
            type: "object",
            properties: {
              location: {
                type: "string",
                description: "City name, e.g. San Francisco",
              },
            },
            required: ["location"],
          },
        },
      },
    ];

    const response = await client.chat.completions.create({
      model: "accounts/fireworks/models/kimi-k2-instruct-0905",
      messages: [{ role: "user", content: "What's the weather in Paris?" }],
      tools: tools,
    });

    console.log(response.choices[0].message.tool_calls);
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="JavaScript (Anthropic SDK)"></span>
```javascript
    import Anthropic from "@anthropic-ai/sdk";

    const client = new Anthropic({
      apiKey: process.env.FIREWORKS_API_KEY,
      baseURL: "https://api.fireworks.ai/inference",
    });

    const response = await client.messages.create({
      model: "accounts/fireworks/models/kimi-k2-instruct-0905",
      max_tokens: 1024,
      messages: [{ role: "user", content: "What's the weather in Paris?" }],
      tools: [
        {
          name: "get_weather",
          description: "Get the current weather for a location",
          input_schema: {
            type: "object",
            properties: {
              location: {
                type: "string",
                description: "City name, e.g. San Francisco",
              },
            },
            required: ["location"],
          },
        },
      ],
    });

    for (const block of response.content) {
      if (block.type === "tool_use") {
        console.log(`Tool: ${block.name}, Input:`, block.input);
      }
    }
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="curl"></span>
```bash
    curl https://api.fireworks.ai/inference/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $FIREWORKS_API_KEY" \
      -d '{
        "model": "accounts/fireworks/models/kimi-k2-instruct-0905",
        "messages": [
          {
            "role": "user",
            "content": "What'\''s the weather in Paris?"
          }
        ],
        "tools": [
          {
            "type": "function",
            "function": {
              "name": "get_weather",
              "description": "Get the current weather for a location",
              "parameters": {
                "type": "object",
                "properties": {
                  "location": {
                    "type": "string",
                    "description": "City name, e.g. San Francisco"
                  }
                },
                "required": ["location"]
              }
            }
          }
        ]
      }'
    ```
  <span class="tab-end"></span>
<span class="tab-group-end"></span>
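
The examples above stop at printing the tool call; your application is responsible for running the tool. The sketch below assumes the OpenAI-style shape, where each tool call has an `id` and a `function` with a `name` and JSON-encoded `arguments`. The `get_weather` body, `TOOLS`, and `run_tool_calls` are hypothetical, and the demo object stands in for `response.choices[0].message.tool_calls`.

```python
import json
from types import SimpleNamespace


def get_weather(location):
    """Hypothetical local tool; a real one would query a weather service."""
    return {"location": location, "forecast": "unknown (stub)"}


# Map tool names declared in the request to local callables.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(tool_calls):
    """Execute each OpenAI-style tool call and build `tool` role messages."""
    results = []
    for call in tool_calls or []:
        func = TOOLS.get(call.function.name)
        if func is None:
            raise KeyError(f"Unknown tool: {call.function.name}")
        # Arguments arrive as a JSON-encoded string, not a dict.
        args = json.loads(call.function.arguments)
        results.append(
            {
                "role": "tool",
                "tool_call_id": call.id,
                "content": json.dumps(func(**args)),
            }
        )
    return results


# Stand-in for response.choices[0].message.tool_calls, for demonstration only.
demo_calls = [
    SimpleNamespace(
        id="call_1",
        function=SimpleNamespace(name="get_weather", arguments='{"location": "Paris"}'),
    )
]
print(run_tool_calls(demo_calls))
```

In the OpenAI chat format, the returned messages are typically appended to the conversation (after the assistant message that contained the tool calls) and sent in a follow-up request so the model can produce a final answer.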

[Learn more about function calling →](/guides/function-calling)

### Structured outputs (JSON mode)

Get reliable JSON responses that match your schema:

<span class="tab-group-start"></span>
  <span class="tab-start" data-tab-title="Python (Fireworks SDK)"></span>
```python
    from fireworks import Fireworks

    client = Fireworks()

    response = client.chat.completions.create(
      model="accounts/fireworks/models/deepseek-v3p1",
      messages=[
        {
          "role": "user",
          "content": "Extract the name and age from: John is 30 years old",
        }
      ],
      response_format={
        "type": "json_schema",
        "json_schema": {
          "name": "person",
          "schema": {
            "type": "object",
            "properties": {
              "name": { "type": "string" },
              "age": { "type": "number" }
            },
            "required": ["name", "age"],
          },
        },
      },
    )

    print(response.choices[0].message.content)
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="Python (OpenAI SDK)"></span>
```python
    import os
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ.get("FIREWORKS_API_KEY"),
        base_url="https://api.fireworks.ai/inference/v1",
    )

    response = client.chat.completions.create(
        model="accounts/fireworks/models/deepseek-v3p1",
        messages=[
            {
                "role": "user",
                "content": "Extract the name and age from: John is 30 years old",
            }
        ],
        response_format={
            "type": "json_schema",
            "json_schema": {
                "name": "person",
                "schema": {
                    "type": "object",
                    "properties": {"name": {"type": "string"}, "age": {"type": "number"}},
                    "required": ["name", "age"],
                },
            },
        },
    )

    print(response.choices[0].message.content)
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="Python (Anthropic SDK)"></span>
```python
    import os
    import anthropic

    client = anthropic.Anthropic(
        api_key=os.environ.get("FIREWORKS_API_KEY"),
        base_url="https://api.fireworks.ai/inference"
    )

    response = client.messages.create(
        model="accounts/fireworks/models/deepseek-v3p1",
        max_tokens=1024,
        output_config={
            "format": {
                "type": "json_schema",
                "schema": {
                    "type": "object",
                    "properties": {
                        "name": { "type": "string" },
                        "age": { "type": "number" }
                    },
                    "required": ["name", "age"],
                },
            }
        },
        messages=[
            {
                "role": "user",
                "content": "Extract the name and age from: John is 30 years old",
            }
        ],
    )

    print(response.content[0].text)
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="JavaScript (OpenAI SDK)"></span>
```javascript
    import OpenAI from "openai";

    const client = new OpenAI({
      apiKey: process.env.FIREWORKS_API_KEY,
      baseURL: "https://api.fireworks.ai/inference/v1",
    });

    const response = await client.chat.completions.create({
      model: "accounts/fireworks/models/deepseek-v3p1",
      messages: [
        {
          role: "user",
          content: "Extract the name and age from: John is 30 years old",
        },
      ],
      response_format: {
        type: "json_schema",
        json_schema: {
          name: "person",
          schema: {
            type: "object",
            properties: {
              name: { type: "string" },
              age: { type: "number" },
            },
            required: ["name", "age"],
          },
        },
      },
    });

    console.log(response.choices[0].message.content);
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="JavaScript (Anthropic SDK)"></span>
```javascript
    import Anthropic from "@anthropic-ai/sdk";

    const client = new Anthropic({
      apiKey: process.env.FIREWORKS_API_KEY,
      baseURL: "https://api.fireworks.ai/inference",
    });

    const response = await client.messages.create({
      model: "accounts/fireworks/models/deepseek-v3p1",
      max_tokens: 1024,
      output_config: {
        format: {
          type: "json_schema",
          schema: {
            type: "object",
            properties: {
              name: { type: "string" },
              age: { type: "number" },
            },
            required: ["name", "age"],
          },
        },
      },
      messages: [
        {
          role: "user",
          content: "Extract the name and age from: John is 30 years old",
        },
      ],
    });

    console.log(response.content[0].text);
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="curl"></span>
```bash
    curl https://api.fireworks.ai/inference/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $FIREWORKS_API_KEY" \
      -d '{
        "model": "accounts/fireworks/models/deepseek-v3p1",
        "messages": [
          {
            "role": "user",
            "content": "Extract the name and age from: John is 30 years old"
          }
        ],
        "response_format": {
          "type": "json_schema",
          "json_schema": {
            "name": "person",
            "schema": {
              "type": "object",
              "properties": {
                "name": {
                  "type": "string"
                },
                "age": {
                  "type": "number"
                }
              },
              "required": ["name", "age"]
            }
          }
        }
      }'
    ```
  <span class="tab-end"></span>
<span class="tab-group-end"></span>
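
The response content is still a JSON string, so it needs to be parsed before use. As a minimal sketch, the hypothetical helper `parse_person` below parses the text and re-checks it against the `person` schema from the examples; the sample string stands in for `response.choices[0].message.content`.

```python
import json


def parse_person(content):
    """Parse the model's JSON text and check it against the `person` schema."""
    data = json.loads(content)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("name"), str):
        raise ValueError("`name` must be a string")
    age = data.get("age")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(age, bool) or not isinstance(age, (int, float)):
        raise ValueError("`age` must be a number")
    return data


# Sample text standing in for response.choices[0].message.content.
print(parse_person('{"name": "John", "age": 30}'))
```

Re-validating on the client is optional, but it keeps downstream code from depending on the response shape without checking it.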

[Learn more about structured outputs →](/structured-responses/structured-response-formatting)

### Reasoning

Some models support reasoning, where the model shows its thought process before giving the final answer:

<span class="tab-group-start"></span>
  <span class="tab-start" data-tab-title="Python (Fireworks SDK)"></span>
```python
    from fireworks import Fireworks

    client = Fireworks()

    response = client.chat.completions.create(
        model="accounts/fireworks/models/deepseek-v3p2",
        messages=[
            {"role": "user", "content": "What is 25 * 37? Show your work."}
        ],
        reasoning_effort="medium",
    )

    msg = response.choices[0].message
    if msg.reasoning_content:
        print("Reasoning:", msg.reasoning_content)
    print("Answer:", msg.content)
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="Python (OpenAI SDK)"></span>
```python
    import os
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ.get("FIREWORKS_API_KEY"),
        base_url="https://api.fireworks.ai/inference/v1",
    )

    response = client.chat.completions.create(
        model="accounts/fireworks/models/deepseek-v3p2",
        messages=[
            {"role": "user", "content": "What is 25 * 37? Show your work."}
        ],
        extra_body={"reasoning_effort": "medium"},
    )

    msg = response.choices[0].message
    # Reasoning content is returned in a separate field
    reasoning = getattr(msg, "reasoning_content", None)
    if reasoning is None and hasattr(msg, "model_extra"):
        reasoning = msg.model_extra.get("reasoning_content")

    if reasoning:
        print("Reasoning:", reasoning)
    print("Answer:", msg.content)
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="Python (Anthropic SDK)"></span>
The Anthropic SDK uses the `thinking` parameter to enable reasoning:

```python
    import os
    import anthropic

    client = anthropic.Anthropic(
        api_key=os.environ.get("FIREWORKS_API_KEY"),
        base_url="https://api.fireworks.ai/inference"
    )

    response = client.messages.create(
        model="accounts/fireworks/models/deepseek-v3p2",
        max_tokens=16000,
        thinking={"type": "enabled", "budget_tokens": 4096},
        messages=[
            {"role": "user", "content": "What is 25 * 37? Show your work."}
        ],
    )

    for block in response.content:
        if block.type == "thinking":
            print("Thinking:", block.thinking)
        elif block.type == "text":
            print("Answer:", block.text)
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="JavaScript (OpenAI SDK)"></span>
```javascript
    import OpenAI from "openai";

    const client = new OpenAI({
      apiKey: process.env.FIREWORKS_API_KEY,
      baseURL: "https://api.fireworks.ai/inference/v1",
    });

    const response = await client.chat.completions.create({
      model: "accounts/fireworks/models/deepseek-v3p2",
      messages: [
        { role: "user", content: "What is 25 * 37? Show your work." },
      ],
      reasoning_effort: "medium",
    });

    const msg = response.choices[0].message;
    if (msg.reasoning_content) {
      console.log("Reasoning:", msg.reasoning_content);
    }
    console.log("Answer:", msg.content);
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="JavaScript (Anthropic SDK)"></span>
The Anthropic SDK uses the `thinking` parameter to enable reasoning:

```javascript
    import Anthropic from "@anthropic-ai/sdk";

    const client = new Anthropic({
      apiKey: process.env.FIREWORKS_API_KEY,
      baseURL: "https://api.fireworks.ai/inference",
    });

    const response = await client.messages.create({
      model: "accounts/fireworks/models/deepseek-v3p2",
      max_tokens: 16000,
      thinking: { type: "enabled", budget_tokens: 4096 },
      messages: [
        { role: "user", content: "What is 25 * 37? Show your work." },
      ],
    });

    for (const block of response.content) {
      if (block.type === "thinking") {
        console.log("Thinking:", block.thinking);
      } else if (block.type === "text") {
        console.log("Answer:", block.text);
      }
    }
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="curl"></span>
```bash
    curl https://api.fireworks.ai/inference/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $FIREWORKS_API_KEY" \
      -d '{
        "model": "accounts/fireworks/models/deepseek-v3p2",
        "messages": [
          {
            "role": "user",
            "content": "What is 25 * 37? Show your work."
          }
        ],
        "reasoning_effort": "medium"
      }'
    ```
  <span class="tab-end"></span>
<span class="tab-group-end"></span>

[Learn more about reasoning →](/guides/reasoning)

### Vision models

Analyze images with vision-language models:

<span class="tab-group-start"></span>
  <span class="tab-start" data-tab-title="Python (Fireworks SDK)"></span>
```python
    from fireworks import Fireworks

    client = Fireworks()

    response = client.chat.completions.create(
      model="accounts/fireworks/models/qwen2p5-vl-32b-instruct",
      messages=[
        {
          "role": "user",
          "content": [
            {"type": "text", "text": "What's in this image?"},
            {
              "type": "image_url",
              "image_url": {
                "url": "https://storage.googleapis.com/fireworks-public/image_assets/fireworks-ai-wordmark-color-dark.png"
              },
            },
          ],
        }
      ],
    )

    print(response.choices[0].message.content)
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="Python (OpenAI SDK)"></span>
```python
    import os
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ.get("FIREWORKS_API_KEY"),
        base_url="https://api.fireworks.ai/inference/v1"
    )

    response = client.chat.completions.create(
        model="accounts/fireworks/models/qwen2p5-vl-32b-instruct",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "What's in this image?"},
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://storage.googleapis.com/fireworks-public/image_assets/fireworks-ai-wordmark-color-dark.png"
                        },
                    },
                ],
            }
        ],
    )

    print(response.choices[0].message.content)
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="Python (Anthropic SDK)"></span>
The Anthropic SDK uses its native image format with `type: "image"` and a `source` object:

```python
    import os
    import anthropic

    client = anthropic.Anthropic(
        api_key=os.environ.get("FIREWORKS_API_KEY"),
        base_url="https://api.fireworks.ai/inference"
    )

    response = client.messages.create(
        model="accounts/fireworks/models/qwen2p5-vl-32b-instruct",
        max_tokens=1024,
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "What's in this image?"},
                    {
                        "type": "image",
                        "source": {
                            "type": "url",
                            "url": "https://storage.googleapis.com/fireworks-public/image_assets/fireworks-ai-wordmark-color-dark.png",
                        },
                    },
                ],
            }
        ],
    )

    for block in response.content:
        if block.type == "text":
            print(block.text)
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="JavaScript (OpenAI SDK)"></span>
```javascript
    import OpenAI from "openai";

    const client = new OpenAI({
      apiKey: process.env.FIREWORKS_API_KEY,
      baseURL: "https://api.fireworks.ai/inference/v1",
    });

    const response = await client.chat.completions.create({
      model: "accounts/fireworks/models/qwen2p5-vl-32b-instruct",
      messages: [
        {
          role: "user",
          content: [
            { type: "text", text: "What's in this image?" },
            {
              type: "image_url",
              image_url: {
                url: "https://storage.googleapis.com/fireworks-public/image_assets/fireworks-ai-wordmark-color-dark.png",
              },
            },
          ],
        },
      ],
    });

    console.log(response.choices[0].message.content);
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="JavaScript (Anthropic SDK)"></span>
As in the Python tab, the Anthropic SDK uses its native image format with `type: "image"` and a `source` object:

```javascript
    import Anthropic from "@anthropic-ai/sdk";

    const client = new Anthropic({
      apiKey: process.env.FIREWORKS_API_KEY,
      baseURL: "https://api.fireworks.ai/inference",
    });

    const response = await client.messages.create({
      model: "accounts/fireworks/models/qwen2p5-vl-32b-instruct",
      max_tokens: 1024,
      messages: [
        {
          role: "user",
          content: [
            { type: "text", text: "What's in this image?" },
            {
              type: "image",
              source: {
                type: "url",
                url: "https://storage.googleapis.com/fireworks-public/image_assets/fireworks-ai-wordmark-color-dark.png",
              },
            },
          ],
        },
      ],
    });

    for (const block of response.content) {
      if (block.type === "text") {
        console.log(block.text);
      }
    }
    ```
  <span class="tab-end"></span>

  <span class="tab-start" data-tab-title="curl"></span>
```bash
    curl https://api.fireworks.ai/inference/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $FIREWORKS_API_KEY" \
      -d '{
        "model": "accounts/fireworks/models/qwen2p5-vl-32b-instruct",
        "messages": [
          {
            "role": "user",
            "content": [
              {
                "type": "text",
                "text": "What'\''s in this image?"
              },
              {
                "type": "image_url",
                "image_url": {
                  "url": "https://storage.googleapis.com/fireworks-public/image_assets/fireworks-ai-wordmark-color-dark.png"
                }
              }
            ]
          }
        ]
      }'
    ```
  <span class="tab-end"></span>
<span class="tab-group-end"></span>
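
The tabs above differ mainly in the shape of the image content part: the OpenAI-compatible requests use `type: "image_url"` with an `image_url` object, while the Anthropic SDK requests use `type: "image"` with a `source` object. The sketch below builds either shape from the same inputs so the difference is easy to see side by side. The helper names (`openai_image_part`, `anthropic_image_part`, `build_user_message`) are hypothetical and not part of any SDK; they only produce plain dictionaries matching the payloads shown above.

```python
def openai_image_part(url: str) -> dict:
    """Image part in the OpenAI-compatible chat completions shape."""
    return {"type": "image_url", "image_url": {"url": url}}


def anthropic_image_part(url: str) -> dict:
    """Image part in the Anthropic Messages shape."""
    return {"type": "image", "source": {"type": "url", "url": url}}


def build_user_message(prompt: str, image_url: str, style: str = "openai") -> dict:
    """Build one user message holding a text part and an image part.

    `style` is a hypothetical switch for this sketch: "openai" or "anthropic".
    """
    builders = {"openai": openai_image_part, "anthropic": anthropic_image_part}
    if style not in builders:
        raise ValueError(f"Unknown style: {style!r}")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            builders[style](image_url),
        ],
    }
```

The returned dictionary can be passed as an element of `messages` to `client.chat.completions.create(...)` (OpenAI style) or `client.messages.create(...)` (Anthropic style), as in the examples above.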

[Learn more about vision models →](/guides/querying-vision-language-models)
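
The examples above all pass a publicly reachable image URL. For a local file, a common approach with OpenAI-compatible APIs is to embed the image as a base64 data URL in the `url` field. The sketch below only builds that string. Whether a given model accepts data URLs, and any size limits, are assumptions here and should be checked against the vision guide linked above. The function name `image_file_to_data_url` is hypothetical.

```python
import base64
import mimetypes
from pathlib import Path


def image_file_to_data_url(path: str) -> str:
    """Encode a local image file as a base64 data URL.

    The MIME type is guessed from the file extension. Acceptance of data
    URLs by the endpoint is an assumption of this sketch.
    """
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot infer an image MIME type for {path!r}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

Under that assumption, the result would replace the `https://...` value in the `image_url.url` field of the OpenAI-style requests shown above.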

## Learn more about Serverless

For the model lifecycle policy, billing details, and serverless-specific request/response behavior, see the [Serverless overview](/serverless/overview).

## Next steps

Ready to scale to production, explore other modalities, or customize your models?

<span class="card-group-start" data-cols="2"></span>
  <span class="card-start" data-card-title="Deploy and autoscale on Dedicated GPUs" data-card-href="/guides/ondemand-deployments"></span>
Deploy on dedicated GPUs for high performance, with fast autoscaling and minimal cold starts
  <span class="card-end"></span>

  <span class="card-start" data-card-title="Fine-tune Models" data-card-href="/fine-tuning/finetuning-intro"></span>
Improve model quality with supervised and reinforcement learning
  <span class="card-end"></span>

  <span class="card-start" data-card-title="Embeddings & Reranking" data-card-href="/guides/querying-embeddings-models"></span>
Use embeddings and reranking for search and context retrieval
  <span class="card-end"></span>

  <span class="card-start" data-card-title="Batch Inference" data-card-href="/guides/batch-inference"></span>
Run async inference jobs at scale, faster and cheaper
  <span class="card-end"></span>

  <span class="card-start" data-card-title="Browse 100+ Models" data-card-href="https://fireworks.ai/models"></span>
Explore all available models across modalities
  <span class="card-end"></span>

  <span class="card-start" data-card-title="API Reference" data-card-href="/api-reference/introduction"></span>
Complete API documentation
  <span class="card-end"></span>
<span class="card-group-end"></span>

Link last verified June 7, 2026.
Source: Fireworks AI Docs