Use server-side caching ↗

langchain guide intermediate agents caching ide testing deployment

Summary: Cache values server-side in your agent deployment using stale-while-revalidate and key-value cache APIs.

Original Documentation

Documentation Index#
Fetch the complete documentation index at: https://docs.langchain.com/llms.txt Use this file to discover all available pages before exploring further.

Cache values server-side in your agent deployment using stale-while-revalidate and key-value cache APIs.

Agent Server includes a built-in cache you can use inside your deployed graphs. Call swr with a key and a loader function, and the server caches the result, revalidates stale entries in the background, and returns fresh data on every read.

All cache APIs are server-side only and require the LangGraph Agent Server runtime. Values must be JSON-serializable.

swr requires Agent Server runtime v0.7.79 or later and is currently in beta. cache_get and cache_set require v0.7.29 or later.

Quick start#

Pass a key and an async loader function. swr returns the cached value if available, or calls your loader to fetch it:

from langgraph_sdk.cache import swr

result = await swr("config:global", load_config)
config_data = result.value

On the first call, swr awaits load_config() and caches the result. On subsequent calls, it returns the cached value instantly and revalidates in the background.

Configure freshness#

Control how long cached values are considered fresh and when they expire:

from datetime import timedelta
from langgraph_sdk.cache import swr

result = await swr(
    "config:global",
    load_config,
    fresh_for=timedelta(minutes=5),
    max_age=timedelta(hours=1),
)

Parameter	Default	Description
`fresh_for`	`timedelta(0)`	Duration to treat a cached value as fresh. During this window, `swr` returns the cached value with no revalidation.
`max_age`	`timedelta(days=1)`	Maximum lifetime of a cached entry. After this, `swr` blocks on the loader before returning. Capped at 1 day.

How revalidation works#

Cache state	Condition	Behavior
Miss	Key not in cache	Awaits `loader()`, stores result, returns it.
Fresh	`age < fresh_for`	Returns cached value, no revalidation.
Stale	`fresh_for <= age < max_age`	Returns cached value immediately, triggers background refresh.
Expired	`age >= max_age`	Awaits `loader()`, stores result, returns it.

Use with Pydantic models#

Pass a model parameter to automatically serialize and deserialize Pydantic models:

from pydantic import BaseModel
from langgraph_sdk.cache import swr

class UserProfile(BaseModel):
    name: str
    email: str
    role: str

result = await swr(
    f"profile:{user_id}",
    lambda: fetch_profile(user_id),
    model=UserProfile,
)
profile: UserProfile = result.value  # deserialized automatically

swr calls model_dump(mode="json") before storing and model.model_validate() when reading back.

Cache auth credentials#

You can cache credential validation in a custom auth handler to avoid hitting your identity provider on every request:

from datetime import timedelta
from langgraph_sdk import Auth
from langgraph_sdk.cache import swr

auth = Auth()

@auth.authenticate
async def authenticate(headers: dict) -> Auth.types.MinimalUserDict:
    token = (headers.get(b"authorization") or b"").decode()
    if not token:
        raise Auth.exceptions.HTTPException(status_code=401, detail="Missing token")

    result = await swr(
        f"auth:token:{token}",
        lambda: validate_and_fetch_user(token),
        fresh_for=timedelta(minutes=5),
        max_age=timedelta(hours=1),
    )
    return result.value

With this setup, the server returns the cached user for 5 minutes without revalidation, then revalidates in the background for up to 1 hour. After 1 hour, the next request blocks until validate_and_fetch_user completes.

Inspect cache status#

swr returns an SWRResult object with the value and cache status:

result = await swr("my-key", my_loader)

result.value   # the cached or freshly loaded value
result.status  # "miss" | "fresh" | "stale" | "expired"

Call .mutate() to update the cached value or force a revalidation:

await result.mutate(new_value)  # update the cache with a new value
await result.mutate()           # force revalidation by calling the loader

Low-level cache API#

For simple get/set caching without revalidation, use cache_get and cache_set directly:

from datetime import timedelta
from langgraph_sdk.cache import cache_get, cache_set

value = await cache_get("my-key")

if value is None:
    value = await expensive_computation()
    await cache_set("my-key", value, ttl=timedelta(hours=1))

`cache_get`#

async def cache_get(key: str) -> Any | None

Return the deserialized value, or None if the key does not exist or has expired.

`cache_set`#

async def cache_set(key: str, value: Any, *, ttl: timedelta | None = None) -> None

Parameter	Type	Default	Description
`key`	`str`	required	The cache key
`value`	`Any`	required	Value to cache. Must be JSON-serializable
`ttl`	`timedelta \| None`	`None`	Time-to-live. The server caps this at 1 day. `None` or zero defaults to 1 day

Next steps#

Add custom authentication to your deployment.
Add custom lifespan events to initialize resources at server startup.
Learn about the agent server architecture.

Edit this page on GitHub or file an issue.

Connect these docs to Claude, VSCode, and more via MCP for real-time answers.

Link last verified June 7, 2026. View original ↗

Source: LangChain Docs

Link last verified: 2026-04-05