Memory ↗

langchain guide intermediate agents memory workflows

Summary: Add persistent memory to agents built with Deep Agents so they learn and improve across conversations

Original Documentation

Documentation Index#
Fetch the complete documentation index at: https://docs.langchain.com/llms.txt Use this file to discover all available pages before exploring further.

Add persistent memory to agents built with Deep Agents so they learn and improve across conversations

Memory lets your agent learn and improve across conversations. Deep Agents makes memory first class with filesystem-backed memory: the agent reads and writes memory as files, and you control where those files are stored using backends.

This page covers long-term memory: memory that persists across conversations. For short-term memory (conversation history and scratch files within a single session), see the context engineering guide. Short-term memory is managed automatically as part of the agent’s state.

How memory works#

Point the agent at memory files. Pass file paths to memory= when creating the agent. A backend controls where those files are stored and who can access them.
Agent loads memory on startup. At the start of each conversation, the agent reads its memory files into the system prompt.
Agent updates memory during the conversation. As the agent learns new information, it uses its built-in edit_file tool to update memory files. Changes are persisted and available in the next conversation.

The two most common patterns are agent-scoped memory (shared across all users) and user-scoped memory (isolated per user).

Agent-scoped memory#

Give the agent a shared memory file that all users read from and write to. The agent improves over time as it accumulates knowledge across every conversation. It can also learn and update skills when it has write access. See skills as procedural memory.

The key is the backend namespace: setting it to (assistant_id,) means every conversation for this agent reads and writes the same memory file.

from deepagents import create_deep_agent
from deepagents.backends import CompositeBackend, StateBackend, StoreBackend
from langgraph.config import get_config

agent = create_deep_agent(
    memory=["/memories/AGENTS.md"],
    backend=CompositeBackend(
        default=StateBackend(),
        routes={
            "/memories/": StoreBackend(
                namespace=lambda ctx: (
                    get_config()["metadata"]["assistant_id"],
                ),
            ),
        },
    ),
)

User-scoped memory#

Give each user their own memory file. The agent remembers preferences, context, and history per user while core agent instructions stay fixed. Users can also have per-user skills if stored in a user-scoped backend.

The namespace uses (user_id,) so each user gets an isolated copy of the memory file. User A’s preferences never leak into User B’s conversations.

from deepagents import create_deep_agent
from deepagents.backends import CompositeBackend, StateBackend, StoreBackend

agent = create_deep_agent(
    memory=["/memories/preferences.md"],
    backend=CompositeBackend(
        default=StateBackend(),
        routes={
            "/memories/": StoreBackend(
                namespace=lambda ctx: (ctx.runtime.context.user_id,),
            ),
        },
    ),
)

Advanced usage#

The sections above cover the basics: configure memory paths, choose a scope, and let the agent handle the rest. This section covers more advanced patterns.

Dimension	Question it answers	Options
Duration	How long does it last?	Short-term (single conversation) or long-term (across conversations)
Information type	What kind of information is it?	Episodic (past experiences), procedural (instructions and skills), or semantic (facts)
Scope	Who can see and modify it?	User, agent, or organization
Update strategy	When are memories written?	During conversation (default) or between conversations
Retrieval	How are memories read?	Loaded into prompt (default) or on demand (e.g., skills)
Agent permissions	Can the agent write to memory?	Read-write (default) or read-only (for shared policies)

Episodic memory#

Episodic memory stores records of past experiences: what happened, in what order, and what the outcome was. Unlike semantic memory (facts and preferences stored in files like AGENTS.md), episodic memory preserves the full conversational context so the agent can recall how a problem was solved, not just what was learned from it.

Deep Agents get episodic memory for free through checkpointers: every conversation is persisted as a checkpointed thread. To make past conversations searchable, wrap thread search in a tool. The user_id is pulled from the runtime context rather than passed as a parameter:

from langgraph_sdk import get_client
from langchain.tools import tool
from langgraph.config import get_config

client = get_client(url="<DEPLOYMENT_URL>")


@tool
async def search_past_conversations(query: str) -> str:
    """Search past conversations for relevant context."""
    config = get_config()
    user_id = config["metadata"]["user_id"]
    threads = await client.threads.search(
        metadata={"user_id": user_id},
        limit=5,
    )
    results = []
    for thread in threads:
        history = await client.threads.get_history(thread_id=thread["thread_id"])
        results.append(history)
    return str(results)

You can scope thread search by user or organization by adjusting the metadata filter:

# Search conversations for a specific user
threads = await client.threads.search(
    metadata={"user_id": user_id},
    limit=5,
)

# Search conversations across an organization
threads = await client.threads.search(
    metadata={"org_id": org_id},
    limit=5,
)

This is useful for agents that perform complex, multi-step tasks. For example, a coding agent can look back at a past debugging session and skip straight to the likely root cause.

Skills as procedural memory#

Skills are a form of procedural memory: reusable instructions that tell the agent how to perform a task. Unlike semantic memory (facts) or episodic memory (experiences), procedural memory encodes step-by-step capabilities the agent can apply on demand.

Skills can be either:

Read-only (developer-defined): The developer writes skills and the agent uses them but cannot modify them. This is the most common pattern.
Read-write (agent-learned): The agent creates and updates skills based on experience. When the agent has write access to memory, it can also write to its skills directory. Use policy hooks to control which paths are writable.

Skills are typically agent-scoped (shared across all users), but can also be user-scoped if stored in a user-namespaced backend.

In Deep Agents, pass skills via the skills= parameter. Skills are loaded on demand rather than injected into every prompt, keeping context lean until a capability is needed:

from deepagents import create_deep_agent

agent = create_deep_agent(
    memory=["/memories/AGENTS.md"],
    skills=["/skills/"],
    # ...backend configuration
)

For a complete guide on defining, organizing, and using skills, see the Skills documentation.

Organization-level memory#

Organization-level memory follows the same pattern as user-scoped memory, but with a shared namespace instead of a per-user one. Use it for policies or knowledge that should apply across all users and agents.

Organization memory is typically read-only to prevent prompt injection via shared state. See read-only vs writable memory for details.

from deepagents import create_deep_agent
from deepagents.backends import CompositeBackend, StateBackend, StoreBackend

agent = create_deep_agent(
    memory=[
        "/memories/preferences.md",
        "/policies/compliance.md",
    ],
    backend=CompositeBackend(
        default=StateBackend(),
        routes={
            "/memories/": StoreBackend(
                namespace=lambda ctx: (ctx.runtime.context.user_id,),
            ),
            "/policies/": StoreBackend(
                namespace=lambda ctx: (),
            ),
        },
    ),
)

Populate organization memory from your application code:

from langgraph_sdk import get_client
from deepagents.backends.utils import create_file_data

client = get_client(url="<DEPLOYMENT_URL>")

await client.store.put_item(
    (),
    "/compliance.md",
    create_file_data("""## Compliance policies
- Never disclose internal pricing
- Always include disclaimers on financial advice
"""),
)

Use policy hooks to enforce that org-level memory is read-only.

Background consolidation#

By default, the agent writes memories during the conversation (hot path). An alternative is to process memories between conversations as a background task, sometimes called sleep time compute. A separate deep agent reviews recent conversations, extracts key facts, and merges them with existing memories.

Approach	Pros	Cons
Hot path (during conversation)	Memories available immediately, transparent to user	Adds latency, agent must multitask
Background (between conversations)	No user-facing latency, can synthesize across multiple conversations	Memories not available until next conversation, requires a second agent

For most applications, the hot path is sufficient. Add background consolidation when you need to reduce latency or improve memory quality across many conversations.

All three approaches deploy a consolidation agent alongside your main agent: a deep agent that reads conversation history, extracts key facts, and merges them into the memory store.

Trigger	When it runs	Best for
Cron	Fixed schedule (e.g., every 6 hours)	Batching across many conversations, synthesizing trends
Scheduled run	After a delay on the same thread	Simple setups where the client controls when consolidation fires

All three approaches use the same consolidation agent and deployment configuration. Define these once, then choose a trigger.

Consolidation agent#

The consolidation agent reads recent conversation history and merges key facts into the memory store. Register it alongside your main agent in langgraph.json:

from datetime import datetime, timedelta, timezone

from deepagents import create_deep_agent
from langchain.tools import tool
from langgraph.config import get_config
from langgraph_sdk import get_client

sdk_client = get_client(url="<DEPLOYMENT_URL>")


@tool
async def search_recent_conversations(query: str) -> str:
    """Search this user's conversations updated in the last 6 hours."""
    config = get_config()
    user_id = config["configurable"]["langgraph_auth_user_id"]

    since = datetime.now(timezone.utc) - timedelta(hours=6)
    threads = await sdk_client.threads.search(
        metadata={"user_id": user_id},
        updated_after=since.isoformat(),
        limit=20,
    )
    conversations = []
    for thread in threads:
        history = await sdk_client.threads.get_history(
            thread_id=thread["thread_id"]
        )
        conversations.append(history["values"]["messages"])
    return str(conversations)


agent = create_deep_agent(
    model="claude-sonnet-4-6",
    system_prompt="""Review recent conversations and update the user's memory file.
Merge new facts, remove outdated information, and keep it concise.""",
    tools=[search_recent_conversations],
)

{
  "dependencies": ["."],
  "graphs": {
    "agent": "./agent.py:agent",
    "consolidation_agent": "./consolidation_agent.py:agent"
  },
  "env": ".env"
}

Cron#

A cron job runs the consolidation agent on a fixed schedule. The agent searches all recent conversations and synthesizes them into memory.

graph LR
    Store[(Memory store)] -.->|reads| Conv1[Conversation 1]
    Store -.->|reads| Conv2[Conversation 2]
    Cron[Cron schedule] -->|periodic| Agent[Consolidation agent]
    Agent -->|writes| Store

    classDef trigger fill:#DCFCE7,stroke:#16A34A,stroke-width:2px,color:#14532D
    classDef process fill:#DBEAFE,stroke:#2563EB,stroke-width:2px,color:#1E3A8A
    classDef output fill:#F3E8FF,stroke:#9333EA,stroke-width:2px,color:#581C87
    classDef schedule fill:#FEF3C7,stroke:#D97706,stroke-width:2px,color:#92400E

    class Conv1,Conv2 trigger
    class Agent process
    class Store output
    class Cron schedule

Schedule the consolidation agent with a cron job:

from langgraph_sdk import get_client

client = get_client(url="<DEPLOYMENT_URL>")

cron_job = await client.crons.create(
    assistant_id="consolidation_agent",
    schedule="0 */6 * * *",
    input={"messages": [{"role": "user", "content": "Consolidate recent memories."}]},
)

All cron schedules are interpreted in UTC. See cron jobs for details on managing and deleting cron jobs.

Scheduled run#

Schedule a run after each conversation using the after_seconds parameter. This is activity-driven: consolidation only fires when a user has been active. Pass the same thread_id so the consolidation agent has access to the conversation context.

graph LR
    Store[(Memory store)] -.->|reads| Conv[Conversation]
    Conv -->|after_seconds| Agent[Consolidation agent]
    Agent -->|writes| Store

    classDef trigger fill:#DCFCE7,stroke:#16A34A,stroke-width:2px,color:#14532D
    classDef process fill:#DBEAFE,stroke:#2563EB,stroke-width:2px,color:#1E3A8A
    classDef output fill:#F3E8FF,stroke:#9333EA,stroke-width:2px,color:#581C87

    class Conv trigger
    class Agent process
    class Store output

The delay gives the conversation time to finish (the user may send follow-up messages) so the consolidation agent sees the full exchange.

If the user sends another message during the delay, the new run’s multitask_strategy controls what happens to the pending consolidation run. Use rollback on the new run to delete the stale consolidation run, then schedule a fresh one after the next response:

from langgraph_sdk import get_client

client = get_client(url="<DEPLOYMENT_URL>")

# Schedule consolidation 30 minutes after the conversation
await client.runs.create(
    thread_id=thread["thread_id"],
    assistant_id="consolidation_agent",
    input={"messages": [{"role": "user", "content": "Consolidate recent memories."}]},
    after_seconds=1800,
)

# If the user sends another message, use rollback on the new run
# to delete the pending consolidation run before starting
await client.runs.stream(
    thread_id=thread["thread_id"],
    assistant_id="agent",
    input={"messages": [{"role": "user", "content": new_message}]},
    multitask_strategy="rollback",
)

See double texting for details on how multitask_strategy controls the behavior of concurrent runs on the same thread.

For more on deploying agents with background processes, see going to production.

Read-only vs writable memory#

By default, the agent can both read and write memory files. For shared state like organization policies or compliance rules, you may want to make memory read-only so the agent can reference it but not modify it. This prevents prompt injection via shared memory and ensures that only your application code controls what’s in the file.

Permission	Use case	How it works
Read-write (default)	User preferences, agent self-improvement, learned skills	Agent updates files via `edit_file` tool
Read-only	Organization policies, compliance rules, shared knowledge bases, developer-defined skills	Populate via application code or the Store API. Use policy hooks to block agent writes.

Security considerations: If one user can write to memory that another user reads, a malicious user could inject instructions into shared state. To mitigate this:

Default to user scope (user_id) unless you have a specific reason to share
Use read-only memory for shared policies (populate via application code, not the agent)
Add human-in-the-loop validation before the agent writes to shared memory. Use an interrupt to require human approval for writes to sensitive paths.

To enforce read-only memory, use policy hooks on the backend to reject write operations to specific paths.

Concurrent writes#

Multiple threads can write to memory in parallel, but concurrent writes to the same file can cause last-write-wins conflicts. For user-scoped memory this is rare since users typically have one active conversation at a time. For agent-scoped or organization-scoped memory, consider using background consolidation to serialize writes, or structure memory as separate files per topic to reduce contention.

In practice, if a write fails due to a conflict, the LLM is usually smart enough to retry or recover gracefully, so a single lost write is not catastrophic.

Multiple agents in the same deployment#

To give each agent its own memory in a shared deployment, add assistant_id to the namespace:

StoreBackend(
    namespace=lambda ctx: (
        get_config()["metadata"]["assistant_id"],
        ctx.runtime.context.user_id,
    ),
)

Use (assistant_id,) alone if you only need per-agent isolation without per-user scoping.

Use LangSmith tracing to audit what your agent writes to memory. Every file write appears as a tool call in the trace.

Edit this page on GitHub or file an issue.

Connect these docs to Claude, VSCode, and more via MCP for real-time answers.

Link last verified June 7, 2026. View original ↗

Source: LangChain Docs

Link last verified: 2026-04-05