<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Caching on AI Knowledge Base</title><link>https://learn-ai.blindshot.kz/topics/caching/</link><description>Recent content in Caching on AI Knowledge Base</description><generator>Hugo</generator><language>en-us</language><atom:link href="https://learn-ai.blindshot.kz/topics/caching/index.xml" rel="self" type="application/rss+xml"/><item><title>Cost Optimization</title><link>https://learn-ai.blindshot.kz/paths/cost-optimization/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/paths/cost-optimization/</guid><description>&lt;p&gt;Minimize AI application costs without sacrificing quality. This path covers the complete cost optimization toolkit: model selection, token counting, prompt caching, and batch processing — comparing approaches across Anthropic and OpenAI.&lt;/p&gt;
&lt;p&gt;The key insight: cost optimization is not about using cheaper models everywhere. It&amp;rsquo;s about matching the right model to each task, caching repeated content, batching non-urgent work, and measuring token usage to eliminate waste. A well-optimized pipeline using GPT-4o-mini + caching can cost less than a naive GPT-3.5 implementation.&lt;/p&gt;</description></item><item><title>Prompt Caching</title><link>https://learn-ai.blindshot.kz/docs/anthropic/platform/build-with-claude/prompt-caching/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/anthropic/platform/build-with-claude/prompt-caching/</guid><description>Cache system prompts and repeated context to reduce latency and costs by up to 90%.</description></item><item><title>Cache</title><link>https://learn-ai.blindshot.kz/docs/dspy/tutorials/cache/_overview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/dspy/tutorials/cache/_overview/</guid><description/></item><item><title>Caching</title><link>https://learn-ai.blindshot.kz/docs/instructor/concepts/caching/_overview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/instructor/concepts/caching/_overview/</guid><description/></item><item><title>Caching</title><link>https://learn-ai.blindshot.kz/docs/ragas/howtos/customizations/_caching/_overview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/ragas/howtos/customizations/_caching/_overview/</guid><description/></item><item><title>Context caching</title><link>https://learn-ai.blindshot.kz/docs/google/adk/context/caching/_overview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/google/adk/context/caching/_overview/</guid><description/></item><item><title>Distributed Architecture</title><link>https://learn-ai.blindshot.kz/docs/chroma/reference/architecture/distributed/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/chroma/reference/architecture/distributed/</guid><description>How Chroma scales out with independent services, object storage, SSD caches, and a shared system database.</description></item><item><title>Kv Cache</title><link>https://learn-ai.blindshot.kz/docs/deepseek/guides/kv_cache/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/deepseek/guides/kv_cache/</guid><description/></item><item><title>Prompt caching</title><link>https://learn-ai.blindshot.kz/docs/fireworks-ai/guides/prompt-caching/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/fireworks-ai/guides/prompt-caching/</guid><description/></item><item><title>Prompt caching</title><link>https://learn-ai.blindshot.kz/docs/openai/api/api/docs/guides/prompt-caching/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/openai/api/api/docs/guides/prompt-caching/</guid><description>Learn how prompt caching reduces latency and cost for long prompts in OpenAI&amp;rsquo;s API.</description></item><item><title>Prompt Caching</title><link>https://learn-ai.blindshot.kz/docs/instructor/concepts/prompt_caching/_overview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/instructor/concepts/prompt_caching/_overview/</guid><description/></item><item><title>Serverless Overview</title><link>https://learn-ai.blindshot.kz/docs/fireworks-ai/serverless/overview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/fireworks-ai/serverless/overview/</guid><description>How Serverless inference works on Fireworks: serving paths, billing, request/response headers, prompt caching, model lifecycle, and when to choose Serverless over On-demand</description></item><item><title>Tool Use With Prompt Caching</title><link>https://learn-ai.blindshot.kz/docs/anthropic/platform/agents-and-tools/tool-use/tool-use-with-prompt-caching/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/anthropic/platform/agents-and-tools/tool-use/tool-use-with-prompt-caching/</guid><description/></item><item><title>Troubleshoot variable caching</title><link>https://learn-ai.blindshot.kz/docs/langchain/langsmith/troubleshooting-variable-caching/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/langchain/langsmith/troubleshooting-variable-caching/</guid><description/></item><item><title>Use server-side caching</title><link>https://learn-ai.blindshot.kz/docs/langchain/langsmith/caching/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://learn-ai.blindshot.kz/docs/langchain/langsmith/caching/</guid><description>Cache values server-side in your agent deployment using stale-while-revalidate and key-value cache APIs.</description></item></channel></rss>