Frequently Asked Questions About Cohere ↗

cohere concept beginner cost-management models

Summary: Cohere is a powerful platform for using Large Language Models (LLMs). This page covers FAQs related to functionality, pricing, troubleshooting, and more.

Original Documentation

title: Frequently Asked Questions About Cohere slug: docs/cohere-faqs hidden: false description: >- Cohere is a powerful platform for using Large Language Models (LLMs). This page covers FAQs related to functionality, pricing, troubleshooting, and more. image: type: fileId value: ‘https://files.buildwithfern.com/cohere.docs.buildwithfern.com/346977d0e82f88b23da74f842e80460af9847cdcd7ac3c4eb2070df017b8824e/assets/images/e2602c8-meta_docs_image_cohere.jpg' keywords: ’natural language processing, generative AI, fine-tuning models’#

Here, we’ll walk through some common questions we get about how Cohere’s models work, what pricing options there are, and more!

Cohere Models#

Command R+ is most suitable for those workflows that lean on complex RAG functionality and multi-step tool use (agents). Command R, on the other hand, is great for simpler retrieval augmented generation (RAG) and single-step tool use tasks, as well as applications where price is a major consideration. We offer a full model overview in our [documentation](https://docs.cohere.com/docs/models). Aya specializes in human-like multilingual text generation and conversations, ideal for content creation and chatbots. Command R excels at understanding and executing instructions, enabling interactive applications and data-driven tasks.This makes it more suitable for many enterprise use cases.

You can check out this link to learn more about Aya models, datasets and related research papers.

Cohere’s Command models have strong performance across enterprise tasks such as summarization, multilingual use cases, and retrieval augmented generation. We also have the widest range of deployment options, you can check it [here](https://cohere.com/deployment-options). You can access Cohere’s models through our platform (cohere.com) or through various cloud platforms including, but not limited to, Sagemaker, Bedrock, Azure AI, and OCI Generatie AI. We also have private deployments. In terms of use case specific features, please reference the latest [API documentation](https://docs.cohere.com/reference/about) to learn more about the API features and [Cookbooks](https://docs.cohere.com/page/cookbooks) with starter code for various tasks to aid development. You can find our prompt engineering recommendations in the following resources:

To fine-tune models for tasks like data extraction, question answering, or content generation, it’s important to start by defining your goals and ensuring your data captures the task accurately.

For generative models, fine-tuning involves training on input-output pairs, where the model learns to generate specific outputs based on given inputs. This is ideal for tasks like customizing responses or enforcing a particular writing style.

For tasks like data extraction, fine-tuning helps the model identify relevant patterns and structure data as needed. High-quality, task-specific data is essential for achieving accurate results.

For more details, you can refer to Cohere’s fine-tuning guide for best practices.

Fine tuning is a powerful capability, but takes some effort to get right. You should first understand what you are trying to achieve and then determine if the data you are planning to train on effectively captures that task. The generative models specifically learn off of input/output pairs and therefore need to see examples of the expected input for your task and the ideal output. For more information, see our finetuning guide.

You can find the best practices for preparing and structuring fine-tuning data across these three modules. Data preparation for [chat fine-tuning](https://docs.cohere.com/docs/chat-preparing-the-data), [classify fine-tuning](https://docs.cohere.com/docs/classify-preparing-the-data) and [rerank fine-tuning](https://docs.cohere.com/docs/rerank-preparing-the-data). The primary file formats supported are jsonl and csv. On the generative side we support fine-tuning for Command R and Command R 082024. On the representation side, we support fine-tuning for Classify and Rerank models. You can learn more about it [in this section](https://docs.cohere.com/docs/fine-tuning) of our docs. For the latest current offerings, you should reference our [models page](https://docs.cohere.com/v1/docs/models). This largely depends on your use case. In general, Cohere has both generative and representation models. The [models page](https://docs.cohere.com/v1/docs/models) has more information on each of these, but use cases can often use a combination of models. Cohere models offer a wide range of capabilities, from advanced generative tasks to semantic search and other representation use cases. All of our models are multilingual and can support use cases from [RAG](https://docs.cohere.com/docs/retrieval-augmented-generation-rag) to [Tool Use](https://docs.cohere.com/docs/tools) and much more.

Our Command model family is our flagship series of generative models. These models excel at taking a user instruction (or command) and generating text following the instruction. They also have conversational capabilities which means that they are well-suited for chatbots and virtual assistants.

For representation tasks, we offer two key models:

Embed: Embed models generate embeddings from text, allowing for tasks like classification, clustering, and semantic search.
Rerank: Rerank models improve the output of search and ranking systems by re-organizing results according to specific parameters, improving the relevance and accuracy of search results.

Our models perform best when used end-to-end in their intended workflows. For a detailed breakdown of each model, including their latest versions, check our models page.

While this depends on the document structure itself, the best rule of thumb would be to split the PDF into its pages and then split each page into chunks that fit our context length.

From there, you should associate each chunk to a page and a doc id which will allow you to have various levels of granularity for retrieval.

You can find further guides on chunking strategies and handling PDFs with mixed data.

Cohere’s models offer multilingual capabilities out of the box. You can reference our example notebooks such as this [RAG one](https://docs.cohere.com/page/basic-rag) to get a better idea of how to piece these models together to build a question answering application. We are always looking to expand multilingual support to other languages. Command R/R+ have been exposed to other languages during training and we encourage you to try it on your use case. If you would like to provide feedback or suggestions on additional languages, please don't hesitate to contact [support@cohere.com](mailto:support@cohere.com). Cohere’s command models are optimized to perform well in the following languages: English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Simplified Chinese, and Arabic.

Additionally, pre-training data has been included for the following 13 languages: Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, Persian.

You can find a full list of languages that are supported by Cohere’s multilingual embedding model here.

You can check the range of use cases based on our customer stories [here](https://cohere.com/use-cases).

Model Deployment#

You can find the updated cloud support listed in our [documentation](https://docs.cohere.com/v1/docs/cohere-works-everywhere). Check out links to our models on [AWS Bedrock](https://aws.amazon.com/bedrock/cohere-command-embed/), [AWS SageMaker](https://aws.amazon.com/marketplace/seller-profile?id=87af0c85-6cf9-4ed8-bee0-b40ce65167e0), [Azure AI](https://ai.azure.com/explore/models?selectedCollection=cohere), and [OCI Generative AI](https://www.oracle.com/artificial-intelligence/generative-ai/generative-ai-service/features/#models). We have the ability to deploy all of our models privately. To learn more, please reach out to the sales team [using this form](https://cohere.com/contact-sales). Please reach out to the sales team to learn more. To learn more, please reach out to the sales team [using this form](https://cohere.com/contact-sales). The default license for our open weights is for non-commercial use. For information about licensing please reach out to the sales team [using this form](https://cohere.com/contact-sales). Please check our deployment options [here](https://cohere.com/deployment-options) and contact our sales team [with this form](https://cohere.com/contact-sales) to learn more.

Platform & API#

We offer two kinds of API keys: trial keys (with a variety of attendant limitations), and production keys (which have no such limitations). You can learn about them in [this section](https://docs.cohere.com/docs/rate-limits) of our documentation. We make a distinction between “trial” and “production” usage of an API key.

Trial API key usage is free, but limited. You can test different applications or build proofs of concept using all of Cohere’s models and APIs with a trial key by simply signing up for a Cohere account here.

Please refer to [API Keys and Rate Limits section](https://docs.cohere.com/v1/docs/rate-limits) of our documentation. You can contact our support team at [support@cohere.com](mailto:support@cohere.com) and get help and share your feedback with our team and developer community via the [Cohere Discord server](https://discord.gg/co-mmunity).

Getting Started#

The Cohere API can be accessed through the SDK. We support SDKs in 4 different languages, Python, Typescript, Java, and Go.

Visit the API docs for further details.

Here are the relevant links:

Dashboard

You can find the resources as follows:

Model pages: Command, Embed, and Rerank.
For business
Cohere documentation

For learning, we recommend our LLM University hub resources, which have been prepared by Cohere experts. These include a number of very high-quality, step-by-step guides to help you start building quickly.

For building, we recommend checking out our Github Notebooks, as well as the Get Started and Cookbooks sections in our documentation.

You can access Command with tools using our [Chat](https://chat.cohere.com/) demo environment, [Developer Playground](https://dashboard.cohere.com/playground/chat), and [Chat API](https://docs.cohere.com/docs/chat-api). For general recommendations on prompt engineering check the following resources:

Prompt Engineering Basics Guide
Tips on Crafting Effective Prompts
Techniques of Advanced Prompt Engineering.

For the most reliable results when working with external document sources, we recommend using a technique called Retrieval-Augmented Generation (RAG). You can learn about it here:

You can find a list of comprehensive tutorials and code examples in our LLM University hub and the Cookbook guides.

Check out our [Cookbooks](https://docs.cohere.com/v1/page/cookbooks), which include step-by-step guides and project examples, and the [Cohere Discord server](https://discord.gg/co-mmunity) for inspiration from our developer community. LLMU can be accessed directly from the [Cohere website](https://cohere.com/llmu). We periodically add more content and highly recommend you follow us on our socials to stay up to date. You can find the documentation with the full Cohere model and feature overview [here](https://docs.cohere.com/).

Troubleshooting Errors#

Here are some common errors and potential solutions for dealing with errors related to API key limitations or missing artifacts.

API Key Limitations#

Cohere’s API keys have certain limitations and permissions associated with them. If you are encountering errors related to API key limitations, it could be due to the following reasons:

Rate Limits: Cohere’s API has rate limits in place to ensure fair usage. If you exceed the allowed number of requests within a specific time frame, you may receive an error. To resolve this, double check the rate limits for your API plan and ensure your usage is within the specified limits. You can also implement a rate-limiting mechanism in your code to control the frequency of API requests.
API Key Expiration: API keys may have an expiration date. If your key has expired, it will no longer work.Check the validity period of your API key and renew it if necessary. Contact Cohere’s support team if you need assistance with key renewal.

Missing Artifacts#

Cohere’s dataset creation process involves generating artifacts, which are essential components for training models. If you receive errors about missing artifacts, consider the following:

Incorrect Dataset Format: Ensure your dataset is in the correct format required by Cohere’s API. Different tasks (e.g., classification, generation) may have specific formatting requirements. Review the documentation for dataset formatting guidelines and ensure your data adheres to the specified structure.
File Upload Issues: Artifacts are generated after successfully uploading your dataset files. Issues with file uploads can lead to missing artifacts. Verify that your dataset files are accessible and not corrupted. You should also check file size limits to ensure your files meet the requirements.
Synchronization Delay: Sometimes, there might be a slight delay in generating artifacts after uploading the dataset. Wait for a few minutes and refresh the dataset status to see if the artifacts are generated.

General Troubleshooting Steps#

If your problem doesn’t fall into these buckets, here are a few other things you can try:

Check API Documentation: Review the Cohere API documentation for dataset creation to ensure you are following the correct steps and parameters.
Inspect API Responses: Carefully examine the error responses returned by the API. They often contain valuable information about the issue. Cohere uses conventional HTTP response codes to indicate the success or failure of an API request. In general:
- Codes in the 2xx range indicate success.
- Codes in the 4xx range indicate an error that failed given the information provided (e.g., a required parameter was omitted, a charge failed, etc.).
- Codes in the 5xx range indicate an error with Cohere’s servers (these are rare).

Review the Errors page to learn more about how to deal with non-2xx response code.

Reach Out to Cohere Support#

If the issue persists, contact Cohere’s support team. They can provide personalized assistance and help identify any specific problems with your API integration.

If you're encountering difficulties logging into your Cohere dashboard, there could be a few reasons.

First, check our status page at status.cohere.com to see if any known issues or maintenance activities might impact your access.

If the status page doesn’t indicate any ongoing issues, the next step would be to reach out to our support teams. They’re always ready to assist and can be contacted at support@cohere.com. Our support team will be able to investigate further and provide you with the necessary guidance to resolve the login issue.

We understand that login and authentication issues can be frustrating. Here are some steps you can take to troubleshoot and resolve these problems:

Check Your Credentials: Ensure you use the correct username and password. It’s easy to make a typo, so double-check your credentials before logging in again.
Clear Cache and Cookies: Sometimes, issues with logging in can be caused by cached data or cookies on your device. Try clearing your browser’s cache and cookies, then attempt to log in again.
Contact Support: If none of the above steps resolve the issue, it’s time to contact our support team. We are equipped to handle a wide range of login and authentication issues and can provide further assistance. You can contact us at support@cohere.com.

If you’re facing any technical challenges or need guidance, our support team is here to help. Contact us at support@cohere.com, and our technical support engineers will provide the necessary assistance and expertise to resolve your issues.

Billing, Pricing, Licensing, Account Management#

Please reach out to our support team at [support@cohere.com](mailto:support@cohere.com). When reaching out to the support team, please keep the following questions in mind:

What model are you referring to?
Copy paste the error message
- Please note that this is our error message information:
  - 400 - invalid combination of parameters
  - 422 - request is malformed (eg: unsupported enum value, unknown param)
  - 499 - request is canceled by the user
  - 401 - invalid api token (not relevant on AWS)
  - 404 - model not found (not relevant on AWS)
  - 429 - rate limit reached (not relevant on AWS)
What is the request seq length you are passing in?
What are the generation max tokens you are requesting?
Are all the requests of various input/output shapes failing?
Share any logs

Please refer to our dedicated pricing page for most up-to-date pricing.

Cohere offers two types of API keys: trial keys and production keys.

Trial Key Limitations

Trial keys are rate-limited depending on the endpoint you want to use. For example, the Embed endpoint is limited to 5 calls per minute, while the Chat endpoint is limited to 20 calls per minute. All other endpoints on trail keys are 1,000 calls per month. If you want to use Cohere endpoints in a production application or require higher throughput, you can upgrade to a production key.

Production Key Specifications

Production keys for all endpoints are rate-limited at 1,000 calls per minute, with unlimited monthly use and are intended for serving Cohere in a public-facing application and testing purposes. Usage of production keys is metered at price points which can be found on the Cohere pricing page.

To get a production key, you’ll need to be the admin of your organization or ask your organization’s admin to create one. Please visit your API Keys > Dashboard, where the process should take less than three minutes and will generate a production key that you can use to serve Cohere APIs in production.

Cohere offers a convenient way to keep track of your usage and billing information. All our endpoints provide this data as metadata for each conversation, which is directly accessible via the API. This ensures you can easily monitor your usage. Our Dashboard provides an additional layer of control for standard accounts. You can set a monthly spending limit to manage your expenses effectively. To learn more about this feature and how to enable it, please visit the Billing & Usage section on the Dashboard, specifically the [Spending Limit](https://dashboard.cohere.com/billing?tab=spending-limit) tab. If you need to make changes to your account or have specific requests, Cohere has a straightforward process. All the essential details about your account can be found under the [Dashboard](https://dashboard.cohere.com). This is a great starting point for any account-related queries.

However, if you have a request that requires further assistance or if the changes you wish to make are not covered by the Dashboard, our support team is here to help. Please feel free to reach out directly at support@cohere.com or ask your question in our Discord community.

Please reach out to our Sales team at [sales@cohere.com](mailto:sales@cohere.com) Cohere's API pricing is based on a simple and transparent token-based model. The cost of using the API is calculated based on the number of tokens consumed during the API calls.

Check our pricing page for more information.

Trial keys are rate-limited depending on the endpoint you want to use, and the monthly limit is 1000 calls per month.

Check our free trial documentation for more information.

Absolutely! Cohere's platform empowers businesses, including startups, to leverage our technology for production and commercial purposes.

In terms of usage guidelines, we’ve compiled a comprehensive set of resources to ensure a smooth and compliant experience. You can access these guidelines here.

We’re excited to support your business and its unique needs. If you have any further questions or require additional assistance, please don’t hesitate to reach out to our team at sales@cohere.com or support@cohere.com for more details.

You can access all the necessary tools and information through your account's dashboard [here](https://dashboard.cohere.com/team).

If you’re unable to find the specific feature or information regarding merging accounts, our support team is always eager to help.

Simply start a new chat with them using the chat bubble on our website or reach out via email to support@cohere.com.

The token limit for multiple documents in a single query can vary depending on the model or service you're using. For instance, our Chat Model has a long-context window of 128k tokens. This means that as long as the combined length of your input and output tokens stays within this limit, the number of documents you include in your query shouldn't be an issue.

It’s important to note that different models may have different token and document limits. To ensure you’re working within the appropriate parameters, we’ve provided detailed information about these limits for each model in this model overview section.

We understand that managing token limits can be a crucial aspect of your work, and we’re here to support you in navigating these considerations effectively. If you have any further questions or require additional assistance, please don’t hesitate to reach out to our team at support@cohere.com

Please find the pricing information about our model and services [here](https://cohere.com/pricing).

Should you have any further questions please feel free to reach out to our sales team at sales@cohere.com or support@cohere.com for more details.

Legal, Security, Data Privacy#

When you’re using Cohere models via our Platform, we segment your data using logical segmentation. When using Cohere models via a private or cloud deployment from one of our partners, your data is not shared with Cohere. We support our enterprise customers’ privacy and data security compliance needs by offering multiple deployment options so customers can control access to data and personal information under their control. Seamlessly complete your privacy and security compliance reviews by visiting Cohere’s [Trust Center](https://cohere-inc.secureframetrust.com/) where you can request a copy of our SOC 2 Type II Report, and review our privacy documentation and other compliance resources. When it comes to using AI models securely, two important areas stand out.

1. Model Security and Safety#

This responsibility lies primarily with the model provider, and at Cohere, we are deeply committed to ensuring responsible AI development. Our team includes some of the top experts in AI security and safety. We lead through various initiatives, including governance and compliance frameworks, safety and security protocols, strict data controls for model training, and industry thought leadership.

2. Secure Application Development with Cohere Models:#

While Cohere ensures the model’s security, customers are responsible for building and deploying applications using these models securely. A strong focus on a Secure Product Lifecycle is essential, and our models integrate seamlessly into this process. Core security principles remain as relevant in the AI space as elsewhere. For example, robust authentication protocols should exist for all users, services, and micro-services. Secrets, tokens, and credentials must be tightly controlled and regularly monitored.

Our recommendations:#

Implement responsible AI and governance policies in your AI development process, focusing on customer safety and security.
Continuously monitor the performance of your applications and promptly address any issues that arise.

We also regularly share insights and best practices on AI security on our blog. Here are a few examples: 1, 2, 3.

If there's anything not covered in this document, you're welcome to reach to us with [this form](https://forms.gle/Mwbn42rrv5vokwFg6).

Link last verified June 7, 2026. View original ↗

Source: Cohere Docs

Link last verified: 2026-02-26