Chat Completions

Summary: Create chat completions using the OpenAI-compatible endpoint

Create a chat completion using the /chat/completions endpoint. This endpoint follows the OpenAI format for sending messages and receiving responses.

## Requirements

To create a chat completion, provide:

- The Inference service base URL: `https://api.inference.wandb.ai/v1`
- Your W&B API key: `<your-api-key>`
- Optional: your W&B team and project: `<your-team>/<your-project>`
- A model ID from the available models
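
As a minimal sketch of gathering these values, the helper below reads them from environment variables instead of hardcoding them. The function name `load_inference_settings` and the `WANDB_PROJECT_PATH` and `WANDB_MODEL_ID` variable names are hypothetical and chosen only for illustration; `OPENAI_API_KEY` is the variable the OpenAI client reads by default.

```python
import os

BASE_URL = "https://api.inference.wandb.ai/v1"


def load_inference_settings(env=None):
    """Collect the values needed for a chat completion request.

    Hypothetical helper: the WANDB_PROJECT_PATH and WANDB_MODEL_ID variable
    names are assumptions made for this sketch, not documented settings.
    """
    env = os.environ if env is None else env

    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise ValueError("Set OPENAI_API_KEY to your W&B API key")

    model = env.get("WANDB_MODEL_ID")
    if not model:
        raise ValueError("Set WANDB_MODEL_ID to a model ID from the available models")

    settings = {"base_url": BASE_URL, "api_key": api_key, "model": model}

    # Team and project are optional, so include them only when provided.
    project = env.get("WANDB_PROJECT_PATH")
    if project:
        settings["project"] = project
    return settings
```

Failing early on a missing key or model ID keeps configuration mistakes from surfacing later as authentication or validation errors from the endpoint.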

## Request examples

Python:

```python
import openai

client = openai.OpenAI(
    # The custom base URL points to W&B Inference
    base_url='https://api.inference.wandb.ai/v1',

    # Create an API key at https://wandb.ai/settings
    # Consider setting it in the environment as OPENAI_API_KEY instead for safety
    api_key="<your-api-key>",

    # Optional: Team and project for usage tracking
    project="<your-team>/<your-project>",
)

# Replace <model-id> with any model ID from the available models list
response = client.chat.completions.create(
    model="<model-id>",
    messages=[
        {"role": "system", "content": "<your-system-prompt>"},
        {"role": "user", "content": "<your-prompt>"}
    ],
)

print(response.choices[0].message.content)
```

Bash:

```bash
curl https://api.inference.wandb.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <your-api-key>" \
  -H "OpenAI-Project: <your-team>/<your-project>" \
  -d '{
    "model": "<model-id>",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Tell me a joke." }
    ]
  }'
```
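
The same request can also be assembled with only the Python standard library, which makes the headers and JSON body from the curl example explicit. This is a sketch under stated assumptions: `build_chat_request` is a hypothetical helper that mirrors the curl call, not part of any official client, and it builds the request without sending it.

```python
import json
import urllib.request

BASE_URL = "https://api.inference.wandb.ai/v1"


def build_chat_request(api_key, model, messages, project=None):
    """Build (but do not send) a POST request to /chat/completions.

    Hypothetical helper that mirrors the curl example.
    """
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    # The team/project header is optional and only used for usage tracking.
    if project:
        headers["OpenAI-Project"] = project

    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions", data=body, headers=headers, method="POST"
    )


request = build_chat_request(
    api_key="<your-api-key>",
    model="<model-id>",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."},
    ],
    project="<your-team>/<your-project>",
)

# To send it, pass the request to urllib.request.urlopen and read the body.
print(request.get_method(), request.full_url)
```

Separating request construction from sending makes it possible to inspect the headers and payload before any network call is made.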

## Response format

The API returns responses in OpenAI-compatible format:

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "meta-llama/Llama-3.1-8B-Instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Here's a joke for you..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 50,
    "total_tokens": 75
  }
}
```
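
To show how these fields are typically read, the sketch below parses a payload with the same shape as the example response using only the standard library. `summarize_completion` is a hypothetical helper, and the sample payload is illustrative data that copies the documented structure, not output from a real call.

```python
import json

# Sample payload with the same shape as the documented response.
raw = """
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "meta-llama/Llama-3.1-8B-Instruct",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Here's a joke for you..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 25, "completion_tokens": 50, "total_tokens": 75}
}
"""


def summarize_completion(payload):
    """Extract the reply text, finish reason, and token usage."""
    first = payload["choices"][0]
    usage = payload["usage"]
    return {
        "text": first["message"]["content"],
        "finish_reason": first["finish_reason"],
        "total_tokens": usage["total_tokens"],
    }


summary = summarize_completion(json.loads(raw))
print(summary["text"])
```

In the OpenAI format, a `finish_reason` of `stop` indicates the model ended its reply on its own, while `length` indicates the output was cut off at a token limit, so checking this field before using the text is a reasonable precaution.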

Link last verified June 7, 2026.

Source: Weights & Biases Docs
