pi-mono/packages/ai/scripts
yanirz 99dc6fcec8
fix(ai): add session affinity and compat fixes for Fireworks provider caching
Fireworks prompt caching is enabled by default (automatic prefix matching),
but on serverless infrastructure, requests hit random replicas. Without
session affinity, the per-replica cache misses, negating cache hit rates
and the discounted cacheRead pricing.

Changes:

- Add sendSessionAffinityHeaders and supportsCacheControlOnTools
  to AnthropicMessagesCompat interface
- Send x-session-affinity header for Fireworks (and Cloudflare AI
  Gateway Anthropic) when sessionId is available and caching is enabled
- Omit cache_control on tool definitions for Fireworks (unsupported
  per https://docs.fireworks.ai/tools-sdks/anthropic-compatibility)
- Default supportsEagerToolInputStreaming to false for Fireworks
  (unsupported field)
- Default supportsLongCacheRetention to false for Fireworks
  (cache_control.ttl not supported)
- Add compat settings to Fireworks models in generate-models.ts
- Update generated models with Fireworks compat settings
- Add integration tests for session affinity and tool compat

Refs: https://docs.fireworks.ai/guides/prompt-caching
Refs: https://docs.fireworks.ai/tools-sdks/anthropic-compatibility
2026-05-10 00:11:36 +02:00
..
generate-image-models.ts feat: image models 2026-05-04 17:07:02 +02:00
generate-models.ts fix(ai): add session affinity and compat fixes for Fireworks provider caching 2026-05-10 00:11:36 +02:00
generate-test-image.ts feat(ai): Add image input tests for vision-capable models 2025-08-30 18:37:17 +02:00