# Streaming API
The LangGraph SDK lets you stream outputs from the LangSmith Deployment API.
LangGraph SDK and Agent Server are a part of LangSmith.
## Basic usage
Basic usage example:
<span class="tab-group-start"></span>
<span class="tab-start" data-tab-title="Python"></span>
```python
from langgraph_sdk import get_client

client = get_client(url=<DEPLOYMENT_URL>, api_key=<API_KEY>)
# Using the graph deployed with the name "agent"
assistant_id = "agent"
# create a thread
thread = await client.threads.create()
thread_id = thread["thread_id"]
# create a streaming run
async for chunk in client.runs.stream(
    thread_id,
    assistant_id,
    input=inputs,
    stream_mode="updates"
):
    print(chunk.data)
```
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="JavaScript"></span>
```javascript
import { Client } from "@langchain/langgraph-sdk";
const client = new Client({ apiUrl: <DEPLOYMENT_URL>, apiKey: <API_KEY> });
// Using the graph deployed with the name "agent"
const assistantID = "agent";
// create a thread
const thread = await client.threads.create();
const threadID = thread["thread_id"];
// create a streaming run
const streamResponse = client.runs.stream(
threadID,
assistantID,
{
input,
streamMode: "updates"
}
);
for await (const chunk of streamResponse) {
console.log(chunk.data);
}
```
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="cURL"></span>
Create a thread:
```bash
curl --request POST \
--url <DEPLOYMENT_URL>/threads \
--header 'Content-Type: application/json' \
--data '{}'
```
Create a streaming run:
```bash
curl --request POST \
--url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/stream \
--header 'Content-Type: application/json' \
--header 'x-api-key: <API_KEY>' \
--data "{
\"assistant_id\": \"agent\",
\"input\": <inputs>,
\"stream_mode\": \"updates\"
}"
```
<span class="tab-end"></span>
<span class="tab-group-end"></span>
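When you call the endpoint directly, as in the cURL tab, you have to split the response into events yourself. The sketch below assumes the response body is framed as Server-Sent Events, with an `event:` line naming the stream mode and a `data:` line carrying a JSON payload. `parse_sse` and the sample body are hypothetical and not part of the SDK.

```python
import json
from typing import Any, Iterator


def parse_sse(text: str) -> Iterator[tuple[str, Any]]:
    """Yield (event, data) pairs from a Server-Sent Events body.

    Assumption: each event is a block of `field: value` lines ended by a
    blank line, and the `data` field holds JSON.
    """
    event, data_lines = "message", []
    for line in text.splitlines() + [""]:  # trailing "" flushes the last event
        if line == "":
            if data_lines:
                yield event, json.loads("\n".join(data_lines))
            event, data_lines = "message", []
        elif line.startswith(":"):
            continue  # SSE comment line, often used as a heartbeat
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())


# Hypothetical response body for a run streamed in `updates` mode.
sample = (
    "event: metadata\n"
    'data: {"run_id": "example-run-id"}\n'
    "\n"
    "event: updates\n"
    'data: {"refine_topic": {"topic": "ice cream and cats"}}\n'
    "\n"
)

events = list(parse_sse(sample))
for event, data in events:
    print(event, data)
```

The SDK clients do this parsing for you, so a helper like this is only relevant when you consume the raw HTTP response.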
<Accordion title="Extended example: streaming updates">
This is an example graph you can run in the Agent Server.
See [LangSmith quickstart](/langsmith/deployment-quickstart) for more details.
```python
# graph.py
from typing import TypedDict
from langgraph.graph import StateGraph, START, END


class State(TypedDict):
    topic: str
    joke: str


def refine_topic(state: State):
    return {"topic": state["topic"] + " and cats"}


def generate_joke(state: State):
    return {"joke": f"This is a joke about {state['topic']}"}


graph = (
    StateGraph(State)
    .add_node(refine_topic)
    .add_node(generate_joke)
    .add_edge(START, "refine_topic")
    .add_edge("refine_topic", "generate_joke")
    .add_edge("generate_joke", END)
    .compile()
)
```
Once you have a running Agent Server, you can interact with it using the LangGraph SDK:
<span class="tab-group-start"></span>
<span class="tab-start" data-tab-title="Python"></span>
```python
from langgraph_sdk import get_client
client = get_client(url=<DEPLOYMENT_URL>)
# Using the graph deployed with the name "agent"
assistant_id = "agent"
# create a thread
thread = await client.threads.create()
thread_id = thread["thread_id"]
# create a streaming run
async for chunk in client.runs.stream( # (1)!
    thread_id,
    assistant_id,
    input={"topic": "ice cream"},
    stream_mode="updates" # (2)!
):
    print(chunk.data)
```
1. The `client.runs.stream()` method returns an iterator that yields streamed outputs.
2. Set `stream_mode="updates"` to stream only the updates to the graph state after each node. Other stream modes are also available. See [supported stream modes](#supported-stream-modes) for details.
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="JavaScript"></span>
```javascript
import { Client } from "@langchain/langgraph-sdk";
const client = new Client({ apiUrl: <DEPLOYMENT_URL> });
// Using the graph deployed with the name "agent"
const assistantID = "agent";
// create a thread
const thread = await client.threads.create();
const threadID = thread["thread_id"];
// create a streaming run
const streamResponse = client.runs.stream( // (1)!
threadID,
assistantID,
{
input: { topic: "ice cream" },
streamMode: "updates" // (2)!
}
);
for await (const chunk of streamResponse) {
console.log(chunk.data);
}
```
1. The `client.runs.stream()` method returns an iterator that yields streamed outputs.
2. Set `streamMode: "updates"` to stream only the updates to the graph state after each node. Other stream modes are also available. See [supported stream modes](#supported-stream-modes) for details.
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="cURL"></span>
Create a thread:
```bash
curl --request POST \
--url <DEPLOYMENT_URL>/threads \
--header 'Content-Type: application/json' \
--data '{}'
```
Create a streaming run:
```bash
curl --request POST \
--url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/stream \
--header 'Content-Type: application/json' \
--data "{
\"assistant_id\": \"agent\",
\"input\": {\"topic\": \"ice cream\"},
\"stream_mode\": \"updates\"
}"
```
<span class="tab-end"></span>
<span class="tab-group-end"></span>
```python
{'run_id': '1f02c2b3-3cef-68de-b720-eec2a4a8e920', 'attempt': 1}
{'refine_topic': {'topic': 'ice cream and cats'}}
{'generate_joke': {'joke': 'This is a joke about ice cream and cats'}}
```
</Accordion>
## Supported stream modes
| Mode | Description | LangGraph Library Method |
|---|---|---|
| `values` | Streams the full graph state after each super-step. | `.stream()` / `.astream()` with `stream_mode="values"` |
| `updates` | Streams the updates to the state after each step of the graph. If multiple updates are made in the same step (e.g., multiple nodes are run), those updates are streamed separately. | `.stream()` / `.astream()` with `stream_mode="updates"` |
| `messages-tuple` | Streams LLM tokens and metadata for the graph node where the LLM is invoked (useful for chat apps). | `.stream()` / `.astream()` with `stream_mode="messages"` |
| `debug` | Streams as much information as possible throughout the execution of the graph. | `.stream()` / `.astream()` with `stream_mode="debug"` |
| `custom` | Streams custom data from inside your graph. | `.stream()` / `.astream()` with `stream_mode="custom"` |
| `events` | Streams all events (including the state of the graph); mainly useful when migrating large LCEL apps. | `.astream_events()` |
## Stream multiple modes
You can pass a list as the `stream_mode` parameter to stream multiple modes at once.
The streamed outputs will be tuples of `(mode, chunk)` where `mode` is the name of the stream mode and `chunk` is the data streamed by that mode.
<span class="tab-group-start"></span>
<span class="tab-start" data-tab-title="Python"></span>
```python
async for chunk in client.runs.stream(
    thread_id,
    assistant_id,
    input=inputs,
    stream_mode=["updates", "custom"]
):
    print(chunk)
```
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="JavaScript"></span>
```js
const streamResponse = client.runs.stream(
threadID,
assistantID,
{
input,
streamMode: ["updates", "custom"]
}
);
for await (const chunk of streamResponse) {
console.log(chunk);
}
```
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="cURL"></span>
```bash
curl --request POST \
--url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/stream \
--header 'Content-Type: application/json' \
--data "{
\"assistant_id\": \"agent\",
\"input\": <inputs>,
\"stream_mode\": [
\"updates\",
\"custom\"
]
}"
```
<span class="tab-end"></span>
<span class="tab-group-end"></span>
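With several modes active, a consumer usually routes each part by its mode. The following is a minimal sketch, assuming each streamed part exposes the mode name as `event` and the payload as `data` (as the `chunk.event` and `chunk.data` accesses on this page suggest). `Part`, `route`, and the sample parts are hypothetical stand-ins so the example runs without a deployment.

```python
from collections import defaultdict
from typing import Any, Iterable, NamedTuple


class Part(NamedTuple):
    """Hypothetical stand-in for one streamed part."""
    event: str
    data: Any


def route(parts: Iterable[Part]) -> dict[str, list[Any]]:
    """Group payloads by the stream mode that produced them."""
    by_mode: dict[str, list[Any]] = defaultdict(list)
    for part in parts:
        by_mode[part.event].append(part.data)
    return dict(by_mode)


# Sample parts as they might arrive with stream_mode=["updates", "custom"].
parts = [
    Part("metadata", {"run_id": "example-run-id"}),
    Part("custom", {"progress": 0.5}),
    Part("updates", {"refine_topic": {"topic": "ice cream and cats"}}),
]

grouped = route(parts)
print(grouped["updates"])
print(grouped["custom"])
```

In a real consumer, the list would be replaced by the async iterator returned by `client.runs.stream(...)` and the loop by `async for`.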
## Stream graph state
Use the stream modes `updates` and `values` to stream the state of the graph as it executes.
* `updates` streams the **updates** to the state after each step of the graph.
* `values` streams the **full value** of the state after each step of the graph.
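The two modes carry the same information in different shapes. The sketch below rebuilds `values`-style snapshots from `updates`-style chunks for the topic and joke example graph used on this page. It assumes every state key is simply overwritten by a node's return value (no custom reducers); `replay_updates` is a hypothetical helper, not an SDK function.

```python
from typing import Any


def replay_updates(
    initial: dict[str, Any], updates: list[dict[str, dict[str, Any]]]
) -> list[dict[str, Any]]:
    """Return the full state after each update chunk.

    Each chunk maps a node name to the partial state that node returned.
    Assumption: keys are overwritten, which holds for a plain TypedDict
    state without reducers.
    """
    state = dict(initial)
    snapshots = []
    for chunk in updates:
        for node_update in chunk.values():
            state.update(node_update)
        snapshots.append(dict(state))
    return snapshots


updates = [
    {"refine_topic": {"topic": "ice cream and cats"}},
    {"generate_joke": {"joke": "This is a joke about ice cream and cats"}},
]

snapshots = replay_updates({"topic": "ice cream"}, updates)
for snapshot in snapshots:
    print(snapshot)
```

If you only need the final state, `values` saves you this bookkeeping; if you need to know which node changed what, `updates` keeps that attribution.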
<Accordion title="Example graph">
```python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END


class State(TypedDict):
    topic: str
    joke: str


def refine_topic(state: State):
    return {"topic": state["topic"] + " and cats"}


def generate_joke(state: State):
    return {"joke": f"This is a joke about {state['topic']}"}


graph = (
    StateGraph(State)
    .add_node(refine_topic)
    .add_node(generate_joke)
    .add_edge(START, "refine_topic")
    .add_edge("refine_topic", "generate_joke")
    .add_edge("generate_joke", END)
    .compile()
)
```
</Accordion>
<span class="callout-start" data-callout-type="note"></span>
**Stateful runs**
Examples below assume that you want to persist the outputs of a streaming run in the checkpointer DB and have created a thread. To create a thread:
<span class="tab-group-start"></span>
<span class="tab-start" data-tab-title="Python"></span>
```python
from langgraph_sdk import get_client
client = get_client(url=<DEPLOYMENT_URL>)
# Using the graph deployed with the name "agent"
assistant_id = "agent"
# create a thread
thread = await client.threads.create()
thread_id = thread["thread_id"]
```
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="JavaScript"></span>
```js
import { Client } from "@langchain/langgraph-sdk";
const client = new Client({ apiUrl: <DEPLOYMENT_URL> });
// Using the graph deployed with the name "agent"
const assistantID = "agent";
// create a thread
const thread = await client.threads.create();
const threadID = thread["thread_id"]
```
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="cURL"></span>
```bash
curl --request POST \
--url <DEPLOYMENT_URL>/threads \
--header 'Content-Type: application/json' \
--data '{}'
```
<span class="tab-end"></span>
<span class="tab-group-end"></span>
If you don't need to persist the outputs of a run, you can pass `None` instead of `thread_id` when streaming.
<span class="callout-end"></span>
### Stream mode: `updates`
Use this to stream only the **state updates** returned by the nodes after each step. The streamed outputs include the name of the node as well as the update.
<span class="tab-group-start"></span>
<span class="tab-start" data-tab-title="Python"></span>
```python
async for chunk in client.runs.stream(
    thread_id,
    assistant_id,
    input={"topic": "ice cream"},
    stream_mode="updates"
):
    print(chunk.data)
```
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="JavaScript"></span>
```javascript
const streamResponse = client.runs.stream(
threadID,
assistantID,
{
input: { topic: "ice cream" },
streamMode: "updates"
}
);
for await (const chunk of streamResponse) {
console.log(chunk.data);
}
```
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="cURL"></span>
```bash
curl --request POST \
--url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/stream \
--header 'Content-Type: application/json' \
--data "{
\"assistant_id\": \"agent\",
\"input\": {\"topic\": \"ice cream\"},
\"stream_mode\": \"updates\"
}"
```
<span class="tab-end"></span>
<span class="tab-group-end"></span>
### Stream mode: `values`
Use this to stream the **full state** of the graph after each step.
<span class="tab-group-start"></span>
<span class="tab-start" data-tab-title="Python"></span>
```python
async for chunk in client.runs.stream(
    thread_id,
    assistant_id,
    input={"topic": "ice cream"},
    stream_mode="values"
):
    print(chunk.data)
```
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="JavaScript"></span>
```javascript
const streamResponse = client.runs.stream(
threadID,
assistantID,
{
input: { topic: "ice cream" },
streamMode: "values"
}
);
for await (const chunk of streamResponse) {
console.log(chunk.data);
}
```
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="cURL"></span>
```bash
curl --request POST \
--url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/stream \
--header 'Content-Type: application/json' \
--data "{
\"assistant_id\": \"agent\",
\"input\": {\"topic\": \"ice cream\"},
\"stream_mode\": \"values\"
}"
```
<span class="tab-end"></span>
<span class="tab-group-end"></span>
## Subgraphs
To include outputs from [subgraphs](/oss/python/langgraph/use-subgraphs) in the streamed outputs, set `stream_subgraphs=True` when calling `client.runs.stream()`. This streams outputs from both the parent graph and any subgraphs.
```python
async for chunk in client.runs.stream(
    thread_id,
    assistant_id,
    input={"foo": "foo"},
    stream_subgraphs=True, # (1)!
    stream_mode="updates",
):
    print(chunk)
```
1. Set `stream_subgraphs=True` to stream outputs from subgraphs.
<Accordion title="Extended example: streaming from subgraphs">
```python
# graph.py
from langgraph.graph import START, StateGraph
from typing import TypedDict


# Define subgraph
class SubgraphState(TypedDict):
    foo: str  # note that this key is shared with the parent graph state
    bar: str


def subgraph_node_1(state: SubgraphState):
    return {"bar": "bar"}


def subgraph_node_2(state: SubgraphState):
    return {"foo": state["foo"] + state["bar"]}


subgraph_builder = StateGraph(SubgraphState)
subgraph_builder.add_node(subgraph_node_1)
subgraph_builder.add_node(subgraph_node_2)
subgraph_builder.add_edge(START, "subgraph_node_1")
subgraph_builder.add_edge("subgraph_node_1", "subgraph_node_2")
subgraph = subgraph_builder.compile()


# Define parent graph
class ParentState(TypedDict):
    foo: str


def node_1(state: ParentState):
    return {"foo": "hi! " + state["foo"]}


builder = StateGraph(ParentState)
builder.add_node("node_1", node_1)
builder.add_node("node_2", subgraph)
builder.add_edge(START, "node_1")
builder.add_edge("node_1", "node_2")
graph = builder.compile()
```
Once you have a running Agent Server, you can interact with it using the LangGraph SDK:
<span class="tab-group-start"></span>
<span class="tab-start" data-tab-title="Python"></span>
```python
from langgraph_sdk import get_client
client = get_client(url=<DEPLOYMENT_URL>)
# Using the graph deployed with the name "agent"
assistant_id = "agent"
# create a thread
thread = await client.threads.create()
thread_id = thread["thread_id"]
async for chunk in client.runs.stream(
    thread_id,
    assistant_id,
    input={"foo": "foo"},
    stream_subgraphs=True, # (1)!
    stream_mode="updates",
):
    print(chunk)
```
1. Set `stream_subgraphs=True` to stream outputs from subgraphs.
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="JavaScript"></span>
```javascript
import { Client } from "@langchain/langgraph-sdk";
const client = new Client({ apiUrl: <DEPLOYMENT_URL> });
// Using the graph deployed with the name "agent"
const assistantID = "agent";
// create a thread
const thread = await client.threads.create();
const threadID = thread["thread_id"];
// create a streaming run
const streamResponse = client.runs.stream(
threadID,
assistantID,
{
input: { foo: "foo" },
streamSubgraphs: true, // (1)!
streamMode: "updates"
}
);
for await (const chunk of streamResponse) {
console.log(chunk);
}
```
1. Set `streamSubgraphs: true` to stream outputs from subgraphs.
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="cURL"></span>
Create a thread:
```bash
curl --request POST \
--url <DEPLOYMENT_URL>/threads \
--header 'Content-Type: application/json' \
--data '{}'
```
Create a streaming run:
```bash
curl --request POST \
--url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/stream \
--header 'Content-Type: application/json' \
--data "{
\"assistant_id\": \"agent\",
\"input\": {\"foo\": \"foo\"},
\"stream_subgraphs\": true,
\"stream_mode\": [
\"updates\"
]
}"
```
<span class="tab-end"></span>
<span class="tab-group-end"></span>
**Note** that the stream contains not just the node updates but also the namespaces, which tell us which graph (or subgraph) each update comes from.
</Accordion>
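A minimal sketch of separating the stream mode from the namespace follows. It assumes that parts coming from a subgraph carry an event name of the form `updates|node_2:<task_id>`, with `|` separating the mode from each namespace segment; check this against the parts your deployment actually emits. `split_event` and `node_path` are hypothetical helpers.

```python
def split_event(event: str) -> tuple[str, tuple[str, ...]]:
    """Split an event name into (mode, namespace).

    Assumption: the format is "mode|segment|segment", where an empty
    namespace means the part came from the parent graph.
    """
    mode, *namespace = event.split("|")
    return mode, tuple(namespace)


def node_path(namespace: tuple[str, ...]) -> list[str]:
    """Drop the task id after ':' from each namespace segment."""
    return [segment.split(":", 1)[0] for segment in namespace]


for event in ["updates", "updates|node_2:example-task-id"]:
    mode, namespace = split_event(event)
    print(mode, node_path(namespace) or "parent graph")
```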
<a id="debug" />
## Debugging
Use the `debug` streaming mode to stream as much information as possible throughout the execution of the graph. The streamed outputs include the name of the node as well as the full state.
<span class="tab-group-start"></span>
<span class="tab-start" data-tab-title="Python"></span>
```python
async for chunk in client.runs.stream(
    thread_id,
    assistant_id,
    input={"topic": "ice cream"},
    stream_mode="debug"
):
    print(chunk.data)
```
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="JavaScript"></span>
```javascript
const streamResponse = client.runs.stream(
threadID,
assistantID,
{
input: { topic: "ice cream" },
streamMode: "debug"
}
);
for await (const chunk of streamResponse) {
console.log(chunk.data);
}
```
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="cURL"></span>
```bash
curl --request POST \
--url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/stream \
--header 'Content-Type: application/json' \
--data "{
\"assistant_id\": \"agent\",
\"input\": {\"topic\": \"ice cream\"},
\"stream_mode\": \"debug\"
}"
```
<span class="tab-end"></span>
<span class="tab-group-end"></span>
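Debug output is verbose, so it often helps to summarize it before reading individual entries. The sketch below assumes each debug payload is a dictionary with a `type` field (for example `task`, `task_result`, or `checkpoint`) and a `step` field. These field names are an assumption about the payload shape, and the sample payloads are hypothetical.

```python
from collections import Counter
from typing import Any


def summarize_debug(payloads: list[dict[str, Any]]) -> Counter:
    """Count debug payloads per (step, type) pair."""
    return Counter((p.get("step"), p.get("type")) for p in payloads)


# Hypothetical debug payloads for a two-node graph.
payloads = [
    {"type": "task", "step": 1, "payload": {"name": "refine_topic"}},
    {"type": "task_result", "step": 1, "payload": {"name": "refine_topic"}},
    {"type": "task", "step": 2, "payload": {"name": "generate_joke"}},
    {"type": "task_result", "step": 2, "payload": {"name": "generate_joke"}},
]

summary = summarize_debug(payloads)
for (step, kind), count in sorted(summary.items()):
    print(step, kind, count)
```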
<a id="messages" />
## LLM tokens
Use the `messages-tuple` streaming mode to stream Large Language Model (LLM) outputs **token by token** from any part of your graph, including nodes, tools, subgraphs, or tasks.
The streamed output from [`messages-tuple` mode](#supported-stream-modes) is a tuple `(message_chunk, metadata)` where:
* `message_chunk`: the token or message segment from the LLM.
* `metadata`: a dictionary containing details about the graph node and LLM invocation.
<Accordion title="Example graph">
```python
from dataclasses import dataclass
from langchain.chat_models import init_chat_model
from langgraph.graph import StateGraph, START


@dataclass
class MyState:
    topic: str
    joke: str = ""


model = init_chat_model(model="gpt-4.1-mini")


def call_model(state: MyState):
    """Call the LLM to generate a joke about a topic"""
    model_response = model.invoke( # (1)!
        [
            {"role": "user", "content": f"Generate a joke about {state.topic}"}
        ]
    )
    return {"joke": model_response.content}


graph = (
    StateGraph(MyState)
    .add_node(call_model)
    .add_edge(START, "call_model")
    .compile()
)
```
1. Note that the message events are emitted even when the LLM is run using `invoke` rather than `stream`.
</Accordion>
<span class="tab-group-start"></span>
<span class="tab-start" data-tab-title="Python"></span>
```python
async for chunk in client.runs.stream(
    thread_id,
    assistant_id,
    input={"topic": "ice cream"},
    stream_mode="messages-tuple",
):
    if chunk.event != "messages":
        continue
    message_chunk, metadata = chunk.data # (1)!
    if message_chunk["content"]:
        print(message_chunk["content"], end="|", flush=True)
```
1. The "messages-tuple" stream mode returns an iterator of tuples `(message_chunk, metadata)` where `message_chunk` is the token streamed by the LLM and `metadata` is a dictionary with information about the graph node where the LLM was called and other information.
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="JavaScript"></span>
```javascript
const streamResponse = client.runs.stream(
threadID,
assistantID,
{
input: { topic: "ice cream" },
streamMode: "messages-tuple"
}
);
for await (const chunk of streamResponse) {
if (chunk.event !== "messages") {
continue;
}
console.log(chunk.data[0]["content"]); // (1)!
}
```
1. The "messages-tuple" stream mode returns an iterator of tuples `(message_chunk, metadata)` where `message_chunk` is the token streamed by the LLM and `metadata` is a dictionary with information about the graph node where the LLM was called and other information.
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="cURL"></span>
```bash
curl --request POST \
--url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/stream \
--header 'Content-Type: application/json' \
--data "{
\"assistant_id\": \"agent\",
\"input\": {\"topic\": \"ice cream\"},
\"stream_mode\": \"messages-tuple\"
}"
```
<span class="tab-end"></span>
<span class="tab-group-end"></span>
### Filter LLM tokens
* To filter the streamed tokens by LLM invocation, you can [associate `tags` with LLM invocations](/oss/python/langgraph/streaming#filter-by-llm-invocation).
* To stream tokens only from specific nodes, use `stream_mode="messages-tuple"` and [filter the outputs by the `langgraph_node` field](/oss/python/langgraph/streaming#filter-by-node) in the streamed metadata.
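As a sketch of the second option, the filter below keeps only the tokens whose metadata names a given node. The payloads are hypothetical stand-ins shaped like the `(message_chunk, metadata)` pairs described above; only the `content` and `langgraph_node` fields are assumed, and `tokens_from_node` is not an SDK function.

```python
from typing import Any, Iterable

Pair = tuple[dict[str, Any], dict[str, Any]]  # (message_chunk, metadata)


def tokens_from_node(pairs: Iterable[Pair], node: str) -> str:
    """Concatenate the token text emitted by a single graph node.

    Assumption: `content` is a plain string in every message chunk.
    """
    return "".join(
        message_chunk.get("content", "")
        for message_chunk, metadata in pairs
        if metadata.get("langgraph_node") == node
    )


# Hypothetical payloads from two nodes interleaved in one stream.
pairs = [
    ({"content": "Why did "}, {"langgraph_node": "call_model"}),
    ({"content": "draft"}, {"langgraph_node": "other_node"}),
    ({"content": "the cat sit?"}, {"langgraph_node": "call_model"}),
]

joke = tokens_from_node(pairs, "call_model")
print(joke)
```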
## Stream custom data
To send **custom user-defined data**:
<span class="tab-group-start"></span>
<span class="tab-start" data-tab-title="Python"></span>
```python
async for chunk in client.runs.stream(
    thread_id,
    assistant_id,
    input={"query": "example"},
    stream_mode="custom"
):
    print(chunk.data)
```
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="JavaScript"></span>
```javascript
const streamResponse = client.runs.stream(
threadID,
assistantID,
{
input: { query: "example" },
streamMode: "custom"
}
);
for await (const chunk of streamResponse) {
console.log(chunk.data);
}
```
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="cURL"></span>
```bash
curl --request POST \
--url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/stream \
--header 'Content-Type: application/json' \
--data "{
\"assistant_id\": \"agent\",
\"input\": {\"query\": \"example\"},
\"stream_mode\": \"custom\"
}"
```
<span class="tab-end"></span>
<span class="tab-group-end"></span>
## Stream events
To stream all events, including the state of the graph:
<span class="tab-group-start"></span>
<span class="tab-start" data-tab-title="Python"></span>
```python
async for chunk in client.runs.stream(
    thread_id,
    assistant_id,
    input={"topic": "ice cream"},
    stream_mode="events"
):
    print(chunk.data)
```
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="JavaScript"></span>
```javascript
const streamResponse = client.runs.stream(
threadID,
assistantID,
{
input: { topic: "ice cream" },
streamMode: "events"
}
);
for await (const chunk of streamResponse) {
console.log(chunk.data);
}
```
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="cURL"></span>
```bash
curl --request POST \
--url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/stream \
--header 'Content-Type: application/json' \
--data "{
\"assistant_id\": \"agent\",
\"input\": {\"topic\": \"ice cream\"},
\"stream_mode\": \"events\"
}"
```
<span class="tab-end"></span>
<span class="tab-group-end"></span>
## Stateless runs
If you don't want to **persist the outputs** of a streaming run in the [checkpointer](/oss/python/langgraph/persistence) DB, you can create a stateless run without creating a thread:
<span class="tab-group-start"></span>
<span class="tab-start" data-tab-title="Python"></span>
```python
from langgraph_sdk import get_client
client = get_client(url=<DEPLOYMENT_URL>, api_key=<API_KEY>)
async for chunk in client.runs.stream(
    None, # (1)!
    assistant_id,
    input=inputs,
    stream_mode="updates"
):
    print(chunk.data)
```
1. We are passing `None` instead of a `thread_id` UUID.
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="JavaScript"></span>
```javascript
import { Client } from "@langchain/langgraph-sdk";
const client = new Client({ apiUrl: <DEPLOYMENT_URL>, apiKey: <API_KEY> });
// create a streaming run
const streamResponse = client.runs.stream(
null, // (1)!
assistantID,
{
input,
streamMode: "updates"
}
);
for await (const chunk of streamResponse) {
console.log(chunk.data);
}
```
1. We are passing `null` instead of a `thread_id` UUID.
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="cURL"></span>
```bash
curl --request POST \
--url <DEPLOYMENT_URL>/runs/stream \
--header 'Content-Type: application/json' \
--header 'x-api-key: <API_KEY>' \
--data "{
\"assistant_id\": \"agent\",
\"input\": <inputs>,
\"stream_mode\": \"updates\"
}"
```
<span class="tab-end"></span>
<span class="tab-group-end"></span>
## Join and stream
LangSmith allows you to join an active [background run](/langsmith/background-run) and stream outputs from it. To do so, you can use [LangGraph SDK's](/langsmith/langgraph-python-sdk) `client.runs.join_stream` method:
<span class="tab-group-start"></span>
<span class="tab-start" data-tab-title="Python"></span>
```python
from langgraph_sdk import get_client
client = get_client(url=<DEPLOYMENT_URL>, api_key=<API_KEY>)
async for chunk in client.runs.join_stream(
    thread_id,
    run_id, # (1)!
):
    print(chunk)
```
1. This is the `run_id` of an existing run you want to join.
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="JavaScript"></span>
```javascript
import { Client } from "@langchain/langgraph-sdk";
const client = new Client({ apiUrl: <DEPLOYMENT_URL>, apiKey: <API_KEY> });
const streamResponse = client.runs.joinStream(
threadID,
runId // (1)!
);
for await (const chunk of streamResponse) {
console.log(chunk);
}
```
1. This is the `run_id` of an existing run you want to join.
<span class="tab-end"></span>
<span class="tab-start" data-tab-title="cURL"></span>
```bash
curl --request GET \
--url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/<RUN_ID>/stream \
--header 'Content-Type: application/json' \
--header 'x-api-key: <API_KEY>'
```
<span class="tab-end"></span>
<span class="tab-group-end"></span>
<span class="callout-start" data-callout-type="warning"></span>
**Outputs not buffered**
When you use `.join_stream`, output is not buffered, so any output produced before joining will not be received.
<span class="callout-end"></span>
***
## API reference
For API usage and implementation, refer to the [API reference](https://docs.langchain.com/langsmith/server-api-ref#tag/thread-runs/POST/threads/\{thread_id}/runs/stream).