Streaming Vs Single Mode ↗
yesEditorial Notes
This guide addresses a fundamental architectural decision you will face early in any agent project. Single-turn mode is simpler to implement and debug because you get a complete response at once, making error handling straightforward and retry logic clean. Streaming mode provides real-time token-by-token output that is essential for user-facing applications where perceived latency matters, but it introduces complexity around partial results and connection management. Start with single-turn mode during development and prototyping, then migrate to streaming when you need production-quality user experience.
Original Documentation
Understanding the two input modes for Claude Agent SDK and when to use each
Overview#
The Claude Agent SDK supports two distinct input modes for interacting with agents:
- Streaming Input Mode (Default & Recommended) - A persistent, interactive session
- Single Message Input - One-shot queries that use session state and resuming
This guide explains the differences, benefits, and use cases for each mode to help you choose the right approach for your application.
Streaming Input Mode (Recommended)#
Streaming input mode is the preferred way to use the Claude Agent SDK. It provides full access to the agent’s capabilities and enables rich, interactive experiences.
It allows the agent to operate as a long lived process that takes in user input, handles interruptions, surfaces permission requests, and handles session management.
How It Works#
sequenceDiagram
participant App as Your Application
participant Agent as Claude Agent
participant Tools as Tools/Hooks
participant FS as Environment/<br/>File System
App->>Agent: Initialize with AsyncGenerator
activate Agent
App->>Agent: Yield Message 1
Agent->>Tools: Execute tools
Tools->>FS: Read files
FS-->>Tools: File contents
Tools->>FS: Write/Edit files
FS-->>Tools: Success/Error
Agent-->>App: Stream partial response
Agent-->>App: Stream more content...
Agent->>App: Complete Message 1
App->>Agent: Yield Message 2 + Image
Agent->>Tools: Process image & execute
Tools->>FS: Access filesystem
FS-->>Tools: Operation results
Agent-->>App: Stream response 2
App->>Agent: Queue Message 3
App->>Agent: Interrupt/Cancel
Agent->>App: Handle interruption
Note over App,Agent: Session stays alive
Note over Tools,FS: Persistent file system<br/>state maintained
deactivate AgentBenefits#
<span class=“card-start” data-card-raw=“title=“Image Uploads” icon=“image”"> Attach images directly to messages for visual analysis and understanding <span class=“card-start” data-card-raw=“title=“Queued Messages” icon=“stack”"> Send multiple messages that process sequentially, with ability to interrupt <span class=“card-start” data-card-raw=“title=“Tool Integration” icon=“wrench”"> Full access to all tools and custom MCP servers during the session <span class=“card-start” data-card-raw=“title=“Hooks Support” icon=“link”"> Use lifecycle hooks to customize behavior at various points <span class=“card-start” data-card-raw=“title=“Real-time Feedback” icon=“lightning”"> See responses as they’re generated, not just final results <span class=“card-start” data-card-raw=“title=“Context Persistence” icon=“database”"> Maintain conversation context across multiple turns naturally
Implementation Example#
async function* generateMessages() {
// First message
yield {
type: "user" as const,
message: {
role: "user" as const,
content: "Analyze this codebase for security issues"
}
};
// Wait for conditions or user input
await new Promise((resolve) => setTimeout(resolve, 2000));
// Follow-up with image
yield {
type: "user" as const,
message: {
role: "user" as const,
content: [
{
type: "text",
text: "Review this architecture diagram"
},
{
type: "image",
source: {
type: "base64",
media_type: "image/png",
data: await readFile("diagram.png", "base64")
}
}
]
}
};
}
// Process streaming responses
for await (const message of query({
prompt: generateMessages(),
options: {
maxTurns: 10,
allowedTools: ["Read", "Grep"]
}
})) {
if (message.type === "result") {
console.log(message.result);
}
}from claude_agent_sdk import (
ClaudeSDKClient,
ClaudeAgentOptions,
AssistantMessage,
TextBlock,
)
import asyncio
import base64
async def streaming_analysis():
async def message_generator():
# First message
yield {
"type": "user",
"message": {
"role": "user",
"content": "Analyze this codebase for security issues",
},
}
# Wait for conditions
await asyncio.sleep(2)
# Follow-up with image
with open("diagram.png", "rb") as f:
image_data = base64.b64encode(f.read()).decode()
yield {
"type": "user",
"message": {
"role": "user",
"content": [
{"type": "text", "text": "Review this architecture diagram"},
{
"type": "image",
"source": {
"type": "base64",
"media_type": "image/png",
"data": image_data,
},
},
],
},
}
# Use ClaudeSDKClient for streaming input
options = ClaudeAgentOptions(max_turns=10, allowed_tools=["Read", "Grep"])
async with ClaudeSDKClient(options) as client:
# Send streaming input
await client.query(message_generator())
# Process responses
async for message in client.receive_response():
if isinstance(message, AssistantMessage):
for block in message.content:
if isinstance(block, TextBlock):
print(block.text)
asyncio.run(streaming_analysis())Single Message Input#
Single message input is simpler but more limited.
When to Use Single Message Input#
Use single message input when:
- You need a one-shot response
- You do not need image attachments, hooks, etc.
- You need to operate in a stateless environment, such as a lambda function
Limitations#
Single message input mode does not support:
- Direct image attachments in messages
- Dynamic message queueing
- Real-time interruption
- Hook integration
- Natural multi-turn conversations
Implementation Example#
// Simple one-shot query
for await (const message of query({
prompt: "Explain the authentication flow",
options: {
maxTurns: 1,
allowedTools: ["Read", "Grep"]
}
})) {
if (message.type === "result") {
console.log(message.result);
}
}
// Continue conversation with session management
for await (const message of query({
prompt: "Now explain the authorization process",
options: {
continue: true,
maxTurns: 1
}
})) {
if (message.type === "result") {
console.log(message.result);
}
}from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage
import asyncio
async def single_message_example():
# Simple one-shot query using query() function
async for message in query(
prompt="Explain the authentication flow",
options=ClaudeAgentOptions(max_turns=1, allowed_tools=["Read", "Grep"]),
):
if isinstance(message, ResultMessage):
print(message.result)
# Continue conversation with session management
async for message in query(
prompt="Now explain the authorization process",
options=ClaudeAgentOptions(continue_conversation=True, max_turns=1),
):
if isinstance(message, ResultMessage):
print(message.result)
asyncio.run(single_message_example())