Web Sync

Summary: Crawl and sync website content into Chroma Cloud.

Web Sync lets you sync content from any publicly accessible website into your Chroma Cloud database. Given a starting URL, Web Sync crawls the website and follows its links up to a specified depth, extracts each page's content as Markdown, chunks it, and inserts the chunks into your Chroma database with embeddings.
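To make the crawl-then-chunk flow described above concrete, here is a minimal sketch in Python. It is not Chroma's implementation: the names `crawl`, `chunk_text`, and `SITE` are hypothetical, the link graph is an in-memory stand-in for real HTTP fetching, and fixed-size character chunking is an assumption chosen for simplicity.

```python
from collections import deque

# Hypothetical in-memory link graph standing in for a real website.
SITE = {
    "/": ["/a", "/b"],
    "/a": ["/c", "/"],
    "/b": ["/c"],
    "/c": [],
}


def crawl(start_url, get_links, max_depth):
    """Breadth-first crawl: return URLs in visit order, following
    links at most max_depth hops away from start_url."""
    seen = {start_url}
    order = []
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return order


def chunk_text(text, max_chars):
    """Split text into consecutive chunks of at most max_chars characters."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


pages = crawl("/", lambda url: SITE.get(url, []), max_depth=1)
chunks = chunk_text("abcdefgh", max_chars=3)
```

With `max_depth=1` the sketch visits the start page and the pages it links to directly; raising the depth lets the crawl reach pages further away. Each chunk would then be embedded and stored as a separate record.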

Walkthrough

If you do not already have a Chroma Cloud account, you will need to create one at trychroma.com. After creating an account, you can create a database by specifying a name:

[Screenshot: Create database screen]

Then, select the Web source during onboarding:

[Screenshot: Onboarding screen]

Next, configure the Web source by providing a starting URL:

[Screenshot: Web source config]

Optionally, you can configure other parameters like the page limit and include path regexes. Here, we’re scraping a maximum of 50 pages under https://docs.trychroma.com/cloud (all our cloud docs):

[Screenshot: Web source config]
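The effect of a page limit combined with include path regexes can be sketched as follows. This is an illustration only: the function `select_pages` is hypothetical, and matching each regex against the URL path with `re.search` is an assumption, since the exact matching rules are not described here. The example URLs are made up for the demonstration.

```python
import re
from urllib.parse import urlparse


def select_pages(urls, include_path_regexes, page_limit):
    """Keep URLs whose path matches any include regex, stopping
    once page_limit pages have been selected."""
    patterns = [re.compile(p) for p in include_path_regexes]
    selected = []
    for url in urls:
        if len(selected) >= page_limit:
            break
        path = urlparse(url).path
        # Assumption: an empty regex list means "include everything".
        if not patterns or any(p.search(path) for p in patterns):
            selected.append(url)
    return selected


# Hypothetical discovered URLs, for illustration only.
discovered = [
    "https://docs.trychroma.com/cloud/page-a",
    "https://docs.trychroma.com/docs/page-b",
    "https://docs.trychroma.com/cloud/page-c",
]
kept = select_pages(discovered, [r"^/cloud"], page_limit=50)
```

Here only the two URLs under `/cloud` are kept, and a `page_limit` of 1 would stop after the first match.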

You can also change the default collection name if you want. After clicking “Create Sync Source”, an initial sync will start:

[Screenshot: Web sync in progress]

After it finishes, you’ll be redirected to the created collection.
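Once the collection exists, you may want to inspect it from code as well as from the dashboard. The sketch below assumes a client object that follows the `chromadb` Python client interface (`get_collection`, then `count` and `get` on the collection); the helper name `preview_collection` is hypothetical.

```python
def preview_collection(client, collection_name, limit=3):
    """Return the record count and the first few stored chunks
    of a collection, given a Chroma-style client object."""
    collection = client.get_collection(name=collection_name)
    total = collection.count()
    # Fetch a small sample of records without running a similarity query.
    sample = collection.get(limit=limit, include=["documents", "metadatas"])
    return total, sample["documents"]
```

With the `chromadb` package installed, the client could be created with `chromadb.CloudClient`, passing your own tenant, database, and API key, and `collection_name` would be the collection name chosen when the sync source was created.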

Link last verified June 7, 2026.
Source: Chroma Docs
Link last verified: 2026-03-04