<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Vision on AI Knowledge Base</title><link>https://learn-ai.blindshot.kz/topics/vision/</link><description>Recent content in Vision on AI Knowledge Base</description><generator>Hugo</generator><language>en-us</language><atom:link href="https://learn-ai.blindshot.kz/topics/vision/index.xml" rel="self" type="application/rss+xml"/><item><title>Vision &amp; Multimodal AI</title><link>https://learn-ai.blindshot.kz/paths/vision-multimodal/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/paths/vision-multimodal/</guid><description>&lt;p&gt;Build applications that understand and generate images, documents, and audio. This path covers vision capabilities across 5 providers, document processing, image generation, multimodal embeddings, and audio — the complete multimodal toolkit.&lt;/p&gt;
&lt;p&gt;The key cross-provider insight: each provider has different vision strengths. OpenAI offers the broadest multimodal coverage (vision + generation + audio), Anthropic excels at document understanding, Cohere provides multimodal embeddings for search, and Mistral/Together AI offer cost-effective open-source alternatives. Choosing the right provider per modality can dramatically improve both quality and cost.&lt;/p&gt;</description></item><item><title>Vision</title><link>https://learn-ai.blindshot.kz/docs/anthropic/platform/build-with-claude/vision/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/anthropic/platform/build-with-claude/vision/</guid><description>Send images to Claude for analysis, OCR, diagram interpretation, and multimodal reasoning.</description></item><item><title>Agentic RAG for PDFs with mixed data</title><link>https://learn-ai.blindshot.kz/docs/cohere/page/agentic-rag-mixed-data/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/cohere/page/agentic-rag-mixed-data/</guid><description>This page describes building a powerful, multi-step chatbot with Cohere&amp;rsquo;s models.</description></item><item><title>Arxiv Paper Tool</title><link>https://learn-ai.blindshot.kz/docs/crewai/en/tools/search-research/arxivpapertool/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/crewai/en/tools/search-research/arxivpapertool/</guid><description>The &amp;lsquo;ArxivPaperTool&amp;rsquo; searches arXiv for papers matching a query and optionally downloads PDFs.</description></item><item><title>Aya Vision</title><link>https://learn-ai.blindshot.kz/docs/cohere/docs/aya-vision/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/cohere/docs/aya-vision/</guid><description>Understand Cohere Labs&amp;rsquo; groundbreaking multilingual model Aya Vision, a state-of-the-art multimodal 
language model excelling at multiple tasks.</description></item><item><title>Basic OCR</title><link>https://learn-ai.blindshot.kz/docs/mistral/docs/capabilities/document_ai/basic_ocr/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/mistral/docs/capabilities/document_ai/basic_ocr/</guid><description>Extract text and structured content from PDFs and images with Mistral&amp;rsquo;s Document AI OCR processor</description></item><item><title>Build a content builder agent</title><link>https://learn-ai.blindshot.kz/docs/langchain/oss/javascript/deepagents/content-builder/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/langchain/oss/javascript/deepagents/content-builder/</guid><description>Build a content writing agent with brand memory, skills, subagents, and image generation</description></item><item><title>Build a content builder agent</title><link>https://learn-ai.blindshot.kz/docs/langchain/oss/python/deepagents/content-builder/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/langchain/oss/python/deepagents/content-builder/</guid><description>Build a content writing agent with brand memory, skills, subagents, and image generation</description></item><item><title>Clone and export reports</title><link>https://learn-ai.blindshot.kz/docs/wandb/models/reports/clone-and-export-reports/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/wandb/models/reports/clone-and-export-reports/</guid><description>Export a W&amp;amp;B Report as a PDF or LaTeX.</description></item><item><title>Cohere's Command A Vision Model</title><link>https://learn-ai.blindshot.kz/docs/cohere/docs/command-a-vision/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/cohere/docs/command-a-vision/</guid><description>Command A Vision is a powerful visual language model 
capable of interacting with image inputs. This document contains information about its capabilities.</description></item><item><title>Computer use</title><link>https://learn-ai.blindshot.kz/docs/openai/api/api/docs/guides/tools-computer-use/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/openai/api/api/docs/guides/tools-computer-use/</guid><description>Enable models to interact with computer interfaces — clicking, typing, and navigating applications via screenshots, creating agents that can operate any software.</description></item><item><title>Connectors Overview</title><link>https://learn-ai.blindshot.kz/docs/mistral/docs/agents/connectors/connectors_overview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/mistral/docs/agents/connectors/connectors_overview/</guid><description>Connectors enable Agents and users to access tools like websearch, code interpreter, image generation, and document library on demand</description></item><item><title>DALL-E Tool</title><link>https://learn-ai.blindshot.kz/docs/crewai/en/tools/ai-ml/dalletool/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/crewai/en/tools/ai-ml/dalletool/</guid><description>The &amp;lsquo;DallETool&amp;rsquo; is a powerful tool designed for generating images from textual descriptions.</description></item><item><title>Dedicated Read Nodes</title><link>https://learn-ai.blindshot.kz/docs/pinecone/guides/index-data/dedicated-read-nodes/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/pinecone/guides/index-data/dedicated-read-nodes/</guid><description>Dedicated read nodes use provisioned hardware for read operations, providing predictable, low-latency performance at high query volumes.</description></item><item><title>Deploying Models in Private 
Environments</title><link>https://learn-ai.blindshot.kz/docs/cohere/docs/single-container-on-private-clouds/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/cohere/docs/single-container-on-private-clouds/</guid><description>Learn how to pull and test Cohere&amp;rsquo;s container images using a license with Docker and Kubernetes.</description></item><item><title>Developer quickstart</title><link>https://learn-ai.blindshot.kz/docs/openai/api/api/docs/quickstart/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/openai/api/api/docs/quickstart/</guid><description>Learn how to use the OpenAI API to generate human-like responses to natural language prompts, analyze images with computer vision, use powerful built-in tools, and more.</description></item><item><title>Files</title><link>https://learn-ai.blindshot.kz/docs/crewai/en/concepts/files/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/crewai/en/concepts/files/</guid><description>Pass images, PDFs, audio, video, and text files to your agents for multimodal processing.</description></item><item><title>How to build a real-time image generator with Flux and Together AI</title><link>https://learn-ai.blindshot.kz/docs/together-ai/external-link-02/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/together-ai/external-link-02/</guid><description/></item><item><title>How To Build An Open Source NotebookLM: PDF To Podcast</title><link>https://learn-ai.blindshot.kz/docs/together-ai/docs/open-notebooklm-pdf-to-podcast/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/together-ai/docs/open-notebooklm-pdf-to-podcast/</guid><description>In this guide we will see how to create a podcast like the one below from a PDF input!</description></item><item><title>Image 
generation</title><link>https://learn-ai.blindshot.kz/docs/openai/api/api/docs/guides/image-generation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/openai/api/api/docs/guides/image-generation/</guid><description>Learn how to generate or edit images with the OpenAI API and image generation models.</description></item><item><title>Image generation</title><link>https://learn-ai.blindshot.kz/docs/openai/api/api/docs/guides/tools-image-generation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/openai/api/api/docs/guides/tools-image-generation/</guid><description>Generate and edit images using DALL-E and GPT-4o&amp;rsquo;s built-in image generation capabilities as tools within the Responses API.</description></item><item><title>Image Generation</title><link>https://learn-ai.blindshot.kz/docs/mistral/docs/agents/connectors/image_generation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/mistral/docs/agents/connectors/image_generation/</guid><description>Built-in tool for agents to generate images on demand with detailed output handling and download options</description></item><item><title>Image Generation</title><link>https://learn-ai.blindshot.kz/docs/together-ai/docs/images-overview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/together-ai/docs/images-overview/</guid><description>Generate high-quality images from text + image prompts.</description></item><item><title>Image Generation Prompt iteration</title><link>https://learn-ai.blindshot.kz/docs/dspy/tutorials/image_generation_prompting/_overview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/dspy/tutorials/image_generation_prompting/_overview/</guid><description/></item><item><title>Image Generation with 
DALL-E</title><link>https://learn-ai.blindshot.kz/docs/crewai/en/learn/dalle-image-generation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/crewai/en/learn/dalle-image-generation/</guid><description>Learn how to use DALL-E for AI-powered image generation in your CrewAI projects</description></item><item><title>Image Generation with Flux2</title><link>https://learn-ai.blindshot.kz/docs/together-ai/docs/dedicated_containers_image/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/together-ai/docs/dedicated_containers_image/</guid><description>Deploy a Flux2 image generation model on Together&amp;rsquo;s managed GPU infrastructure using Dedicated Containers.</description></item><item><title>Image, Audio, Video &amp;amp; Document Input</title><link>https://learn-ai.blindshot.kz/docs/pydantic-ai/input/_overview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/pydantic-ai/input/_overview/</guid><description/></item><item><title>Images and vision</title><link>https://learn-ai.blindshot.kz/docs/openai/api/api/docs/guides/images-vision/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/openai/api/api/docs/guides/images-vision/</guid><description>Learn how to understand or generate images with the OpenAI API.</description></item><item><title>Include multimodal content in a prompt</title><link>https://learn-ai.blindshot.kz/docs/langchain/langsmith/multimodal-content/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/langchain/langsmith/multimodal-content/</guid><description/></item><item><title>Introduction to Aya Vision</title><link>https://learn-ai.blindshot.kz/docs/cohere/page/aya-vision-intro/</link><pubDate>Mon, 01 Jan 0001 00:00:00 
+0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/cohere/page/aya-vision-intro/</guid><description>In this notebook, we will explore the capabilities of Aya Vision, which can take text and image inputs to generate text responses.</description></item><item><title>Log media</title><link>https://learn-ai.blindshot.kz/docs/wandb/weave/guides/core-types/media/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/wandb/weave/guides/core-types/media/</guid><description>Log media returned in your traces, such as images and videos.</description></item><item><title>Log multimodal traces</title><link>https://learn-ai.blindshot.kz/docs/langchain/langsmith/log-multimodal-traces/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/langchain/langsmith/log-multimodal-traces/</guid><description/></item><item><title>Mirror images for your LangSmith installation</title><link>https://learn-ai.blindshot.kz/docs/langchain/langsmith/self-host-mirroring-images/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/langchain/langsmith/self-host-mirroring-images/</guid><description/></item><item><title>Models Benchmarks</title><link>https://learn-ai.blindshot.kz/docs/mistral/docs/getting-started/models/benchmark/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/mistral/docs/getting-started/models/benchmark/</guid><description>Mistral&amp;rsquo;s benchmarked models excel in reasoning, multilingual tasks, coding, and multimodal capabilities, outperforming competitors in key benchmarks</description></item><item><title>Models Overview</title><link>https://learn-ai.blindshot.kz/docs/mistral/docs/getting-started/models/overview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/mistral/docs/getting-started/models/overview/</guid><description>Mistral offers open and 
premier models for various tasks, including text, code, audio, and multimodal processing</description></item><item><title>Multi-modal Messages</title><link>https://learn-ai.blindshot.kz/docs/ag-ui/drafts/multimodal-messages/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/ag-ui/drafts/multimodal-messages/</guid><description>Support for multimodal input messages including text, images, audio, video, and documents</description></item><item><title>Multimodal</title><link>https://learn-ai.blindshot.kz/docs/instructor/concepts/multimodal/_overview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/instructor/concepts/multimodal/_overview/</guid><description/></item><item><title>Multimodal context for assistants</title><link>https://learn-ai.blindshot.kz/docs/pinecone/guides/assistant/multimodal/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/pinecone/guides/assistant/multimodal/</guid><description>Process images and charts in PDFs with multimodal assistants.</description></item><item><title>Multimodal Embeddings</title><link>https://learn-ai.blindshot.kz/docs/chroma/docs/embeddings/multimodal/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/chroma/docs/embeddings/multimodal/</guid><description>Learn how to work with multimodal data in Chroma collections.</description></item><item><title>Multimodal Inputs</title><link>https://learn-ai.blindshot.kz/docs/ag-ui/sdk/js/core/multimodal-inputs/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/ag-ui/sdk/js/core/multimodal-inputs/</guid><description>Use modality-specific user input parts with typed data/url sources in @ag-ui/core</description></item><item><title>Multimodal Inputs</title><link>https://learn-ai.blindshot.kz/docs/ag-ui/sdk/python/core/multimodal-inputs/</link><pubDate>Mon, 01 
Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/ag-ui/sdk/python/core/multimodal-inputs/</guid><description>Use modality-specific user input parts with typed data/url sources in ag_ui.core</description></item><item><title>Multimodal Metrics Image Coherence</title><link>https://learn-ai.blindshot.kz/docs/deepeval/docs/multimodal-metrics-image-coherence/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/deepeval/docs/multimodal-metrics-image-coherence/</guid><description/></item><item><title>Multimodal Metrics Image Editing</title><link>https://learn-ai.blindshot.kz/docs/deepeval/docs/multimodal-metrics-image-editing/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/deepeval/docs/multimodal-metrics-image-editing/</guid><description/></item><item><title>Multimodal Metrics Image Helpfulness</title><link>https://learn-ai.blindshot.kz/docs/deepeval/docs/multimodal-metrics-image-helpfulness/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/deepeval/docs/multimodal-metrics-image-helpfulness/</guid><description/></item><item><title>Multimodal Metrics Image Reference</title><link>https://learn-ai.blindshot.kz/docs/deepeval/docs/multimodal-metrics-image-reference/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/deepeval/docs/multimodal-metrics-image-reference/</guid><description/></item><item><title>Multimodal Metrics Text To Image</title><link>https://learn-ai.blindshot.kz/docs/deepeval/docs/multimodal-metrics-text-to-image/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/deepeval/docs/multimodal-metrics-text-to-image/</guid><description/></item><item><title>OCR Tool</title><link>https://learn-ai.blindshot.kz/docs/crewai/en/tools/file-document/ocrtool/</link><pubDate>Mon, 01 Jan 0001 00:00:00 
+0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/crewai/en/tools/file-document/ocrtool/</guid><description>The &amp;lsquo;OCRTool&amp;rsquo; extracts text from local images or image URLs using an LLM with vision.</description></item><item><title>OpenAI CLI</title><link>https://learn-ai.blindshot.kz/docs/openai/api/api/docs/libraries/openai-cli/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/openai/api/api/docs/libraries/openai-cli/</guid><description>Install and use the generated openai command-line tool for Responses, structured outputs, images, speech, and shell workflows.</description></item><item><title>Overview</title><link>https://learn-ai.blindshot.kz/docs/crewai/en/tools/ai-ml/overview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/crewai/en/tools/ai-ml/overview/</guid><description>Leverage AI services, generate images, process vision, and build intelligent systems</description></item><item><title>Part 5. 
Audio, Images, and Video</title><link>https://learn-ai.blindshot.kz/docs/google/adk/streaming/dev-guide/part5/_overview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/google/adk/streaming/dev-guide/part5/_overview/</guid><description/></item><item><title>PDF Extractor with Native Multi Step Tool Use</title><link>https://learn-ai.blindshot.kz/docs/cohere/page/pdf-extractor/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/cohere/page/pdf-extractor/</guid><description>This page describes how to create an AI agent able to extract information from PDFs.</description></item><item><title>PDF RAG Search</title><link>https://learn-ai.blindshot.kz/docs/crewai/en/tools/file-document/pdfsearchtool/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/crewai/en/tools/file-document/pdfsearchtool/</guid><description>The &amp;lsquo;PDFSearchTool&amp;rsquo; is designed to search PDF files and return the most relevant results.</description></item><item><title>Pdf Support</title><link>https://learn-ai.blindshot.kz/docs/anthropic/platform/build-with-claude/pdf-support/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/anthropic/platform/build-with-claude/pdf-support/</guid><description>&lt;p&gt;This is the reference for sending PDFs to Claude, and it matters because Claude processes both the extracted text and the rendered page images, letting it reason over tables, charts, and scanned layouts that plain text extraction would lose. Pay close attention to the page and size limits and to token accounting, since each page consumes both text and image tokens and costs add up fast. A common pitfall is assuming PDFs are as cheap as text. 
Compared with Mistral&amp;rsquo;s and OpenAI&amp;rsquo;s image inputs the differentiator is native multi-page document handling; read the token-counting page alongside this to estimate cost.&lt;/p&gt;</description></item><item><title>PDF Text Writing Tool</title><link>https://learn-ai.blindshot.kz/docs/crewai/en/tools/file-document/pdf-text-writing-tool/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/crewai/en/tools/file-document/pdf-text-writing-tool/</guid><description>The &amp;lsquo;PDFTextWritingTool&amp;rsquo; writes text to specific positions in a PDF, supporting custom fonts.</description></item><item><title>Playground</title><link>https://learn-ai.blindshot.kz/docs/together-ai/docs/inference-web-interface/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/together-ai/docs/inference-web-interface/</guid><description>Guide to using Together AI&amp;rsquo;s web playground for interactive AI model inference across chat, image, video, audio, and transcribe models.</description></item><item><title>Quickstart: Flux Kontext</title><link>https://learn-ai.blindshot.kz/docs/together-ai/docs/quickstart-flux-kontext/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/together-ai/docs/quickstart-flux-kontext/</guid><description>Learn how to use Flux&amp;rsquo;s new in-context image generation models</description></item><item><title>Quickstart: FLUX.2</title><link>https://learn-ai.blindshot.kz/docs/together-ai/docs/quickstart-flux/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/together-ai/docs/quickstart-flux/</guid><description>Learn how to use FLUX.2, the next generation image model with advanced prompting capabilities</description></item><item><title>Quickstart: How to do 
OCR</title><link>https://learn-ai.blindshot.kz/docs/together-ai/docs/quickstart-how-to-do-ocr/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/together-ai/docs/quickstart-how-to-do-ocr/</guid><description>A step by step guide on how to do OCR with Together AI&amp;rsquo;s vision models with structured outputs</description></item><item><title>Realtime API</title><link>https://learn-ai.blindshot.kz/docs/openai/api/api/docs/guides/realtime/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/openai/api/api/docs/guides/realtime/</guid><description>Learn how to build low-latency, multimodal LLM applications with the Realtime API.</description></item><item><title>Run an evaluation with multimodal content</title><link>https://learn-ai.blindshot.kz/docs/langchain/langsmith/evaluate-with-attachments/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/langchain/langsmith/evaluate-with-attachments/</guid><description/></item><item><title>Sandbox templates</title><link>https://learn-ai.blindshot.kz/docs/langchain/langsmith/sandbox-templates/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/langchain/langsmith/sandbox-templates/</guid><description>Define container images, resource limits, and configuration for sandboxes using templates.</description></item><item><title>Sandbox warm pools</title><link>https://learn-ai.blindshot.kz/docs/langchain/langsmith/sandbox-warm-pools/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/langchain/langsmith/sandbox-warm-pools/</guid><description>Pre-provision sandboxes for faster execution with automatic replenishment.</description></item><item><title>Serverless Pricing</title><link>https://learn-ai.blindshot.kz/docs/fireworks-ai/serverless/pricing/</link><pubDate>Mon, 01 Jan 0001 00:00:00 
+0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/fireworks-ai/serverless/pricing/</guid><description>Per-token serverless pricing for text, vision, and embedding models, including Priority and Fast serving paths</description></item><item><title>Supervised Fine Tuning - Vision</title><link>https://learn-ai.blindshot.kz/docs/fireworks-ai/fine-tuning/fine-tuning-vlm/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/fireworks-ai/fine-tuning/fine-tuning-vlm/</guid><description>Learn how to fine-tune vision-language models on Fireworks AI with image and text datasets</description></item><item><title>Text &amp; Vision Fine-tuning</title><link>https://learn-ai.blindshot.kz/docs/mistral/docs/capabilities/finetuning/text-vision-finetuning/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/mistral/docs/capabilities/finetuning/text-vision-finetuning/</guid><description>Fine-tune Mistral&amp;rsquo;s text and vision models with custom datasets in JSONL format for domain-specific or conversational improvements</description></item><item><title>Together AI Skills</title><link>https://learn-ai.blindshot.kz/docs/together-ai/docs/agent-skills/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/together-ai/docs/agent-skills/</guid><description>Give your AI coding agent deep knowledge of the Together AI platform with ready-made skills for inference, training, images, video, audio, and infrastructure.</description></item><item><title>Trace and Evaluate a Computer Vision Pipeline with Weave</title><link>https://learn-ai.blindshot.kz/docs/wandb/weave/cookbooks/ocr-pipeline/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/wandb/weave/cookbooks/ocr-pipeline/</guid><description>Learn how to trace and evaluate a computer vision pipeline with W&amp;amp;B 
Weave</description></item><item><title>Unlocking the Power of Multimodal Embeddings</title><link>https://learn-ai.blindshot.kz/docs/cohere/docs/multimodal-embeddings/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/cohere/docs/multimodal-embeddings/</guid><description>Multimodal embeddings convert text and images into embeddings for search and classification (API v2).</description></item><item><title>Using Cohere's Models to Work with Image Inputs</title><link>https://learn-ai.blindshot.kz/docs/cohere/docs/image-inputs/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/cohere/docs/image-inputs/</guid><description>This page describes how a Cohere large language model works with image inputs. It covers passing images with the API, limitations, and best practices.</description></item><item><title>Using Multimodal Agents</title><link>https://learn-ai.blindshot.kz/docs/crewai/en/learn/multimodal-agents/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/crewai/en/learn/multimodal-agents/</guid><description>Learn how to enable and use multimodal capabilities in your agents for processing images and other non-text content within the CrewAI framework.</description></item><item><title>Video &amp; Audio Inputs</title><link>https://learn-ai.blindshot.kz/docs/fireworks-ai/guides/video-audio-inputs/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/fireworks-ai/guides/video-audio-inputs/</guid><description>Query multimodal models to process video and audio content directly</description></item><item><title>Video Generation</title><link>https://learn-ai.blindshot.kz/docs/together-ai/docs/videos-overview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/together-ai/docs/videos-overview/</guid><description>Generate high-quality videos from text and 
image prompts.</description></item><item><title>Vision</title><link>https://learn-ai.blindshot.kz/docs/mistral/docs/capabilities/vision/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/mistral/docs/capabilities/vision/</guid><description>Multimodal AI models analyze images and text for insights, supporting use cases like OCR, chart understanding, and receipt transcription</description></item><item><title>Vision fine-tuning</title><link>https://learn-ai.blindshot.kz/docs/openai/api/api/docs/guides/vision-fine-tuning/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/openai/api/api/docs/guides/vision-fine-tuning/</guid><description>Fine-tune models for better image understanding.</description></item><item><title>Vision Inputs</title><link>https://learn-ai.blindshot.kz/docs/fireworks-ai/fine-tuning/training-api/vision-inputs/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/fireworks-ai/fine-tuning/training-api/vision-inputs/</guid><description>Fine-tune vision-language models (VLMs) with the Training API using multimodal chat data containing images and text.</description></item><item><title>Vision LLMs</title><link>https://learn-ai.blindshot.kz/docs/together-ai/docs/vision-overview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/together-ai/docs/vision-overview/</guid><description>Learn how to use the vision models supported by Together AI.</description></item><item><title>Vision Models</title><link>https://learn-ai.blindshot.kz/docs/fireworks-ai/guides/querying-vision-language-models/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/fireworks-ai/guides/querying-vision-language-models/</guid><description>Query vision-language models to analyze images and visual content</description></item><item><title>Vision 
Tool</title><link>https://learn-ai.blindshot.kz/docs/crewai/en/tools/ai-ml/visiontool/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/crewai/en/tools/ai-ml/visiontool/</guid><description>The &amp;lsquo;VisionTool&amp;rsquo; is designed to extract text from images.</description></item><item><title>Vision-Language Fine-tuning</title><link>https://learn-ai.blindshot.kz/docs/together-ai/docs/fine-tuning-vlm/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/together-ai/docs/fine-tuning-vlm/</guid><description>Learn how to fine-tune Vision-Language Models (VLMs) on image+text data using Together AI.</description></item></channel></rss>