diff --git a/docs/src/ai/models.md b/docs/src/ai/models.md
index e2d0c1c83cd..f582f25af01 100644
--- a/docs/src/ai/models.md
+++ b/docs/src/ai/models.md
@@ -91,6 +91,8 @@ As of February 19, 2026, Zed Pro serves newer model versions in place of the ret
 
 ## Usage {#usage}
 
+Because Zed-hosted Gemini models do not use Google context caching, Gemini usage is billed only as input and output tokens; there is no separate cached-input price for these models. This preserves zero-data-retention behavior for hosted Gemini requests. For background, see Google's Vertex AI documentation on [context caching](https://cloud.google.com/vertex-ai/generative-ai/docs/context-cache/context-cache-overview) and [zero data retention](https://cloud.google.com/vertex-ai/generative-ai/docs/vertex-ai-zero-data-retention).
+
 Any usage of a Zed-hosted model will be billed at the Zed Price (rightmost column above). See [Plans and Usage](./plans-and-usage.md) for details on Zed's plans and limits for use of hosted models.
 
 > LLMs can enter unproductive loops that require user intervention. Monitor longer-running tasks and interrupt if needed.