Minimize latencies
Several factors can increase latency when you work with Pinecone. The sections below cover the most common causes and how to address them.
Slow uploads or high latencies#
To minimize latency when accessing Pinecone:
- Access Pinecone from a cloud environment, such as EC2, GCE, Google Colab, GCP AI Platform Notebook, or SageMaker Notebook. Slow uploads or high query latencies might be caused by accessing Pinecone from your home network.
- Consider deploying your application in the same environment as your Pinecone service.
- See Decrease latency for more tips.
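To check whether a change of environment actually helps, you can time the same request from each location. The following is a minimal sketch, not part of the Pinecone client: `measure_latency` is a hypothetical helper that times any zero-argument callable, and the commented usage line assumes an existing `index` object and query vector `vec`.

```python
import statistics
import time


def measure_latency(call, repeats=20):
    """Run call() `repeats` times and return (median_ms, max_ms)."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        call()
        # Convert elapsed seconds to milliseconds.
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples), max(samples)


# Hypothetical usage, assuming `index` and `vec` already exist:
# median_ms, max_ms = measure_latency(lambda: index.query(vector=vec, top_k=10))
```

Running the same measurement from a home network and from a cloud instance gives a direct comparison of round-trip times.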
High query latencies with batching#
If you’re batching queries, try reducing the number of queries per call to 1 query vector. You can make these calls in parallel and expect roughly the same performance as with batching.
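One way to issue single-vector queries in parallel is a thread pool. The sketch below is an illustration under stated assumptions: `query_in_parallel` and `query_one` are hypothetical names, and `query_one` stands for any function that sends one query vector per call (for example, a thin wrapper around your index's query operation).

```python
from concurrent.futures import ThreadPoolExecutor


def query_in_parallel(query_one, vectors, max_workers=8):
    """Call query_one(vector) for each vector concurrently.

    Results are returned in the same order as the input vectors.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order even if calls finish out of order.
        return list(pool.map(query_one, vectors))


# Hypothetical usage, assuming `index` and `vectors` already exist:
# results = query_in_parallel(lambda v: index.query(vector=v, top_k=10), vectors)
```

Threads suit this case because each call spends most of its time waiting on the network. The right `max_workers` value depends on your workload, so treat the default here as a starting point.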
High latencies with fetch or include_values#
For on-demand indexes, vector values are retrieved from object storage, so operations that return vector values (fetch operations or queries with include_values=true) may have increased latency. To avoid this cost when you do not need the vector values, set include_values=false when querying. If you only need metadata or IDs, use the query operation instead of fetch. See Decrease latency for more details.
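As a minimal sketch, a query that skips vector values might look like the following. The wrapper name `query_without_values` is hypothetical, and the keyword arguments are assumed to match the Pinecone Python client's `query` method; check them against the client version you use.

```python
def query_without_values(index, vector, top_k=10):
    """Query for IDs, scores, and metadata only, without vector values."""
    return index.query(
        vector=vector,
        top_k=top_k,
        include_values=False,   # skip retrieving vector values
        include_metadata=True,  # still return metadata with each match
    )
```

Passing the index in as an argument keeps the helper easy to exercise with a stand-in object before pointing it at a real index.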