Commit graph

141 commits

Author SHA1 Message Date
Armin Ronacher
ae9450dc51 chore(ts): use source import extensions 2026-05-20 00:04:03 +02:00
Mario Zechner
b8f51957a0 fix(ai): add Xiaomi reasoning replay compat
closes #4678
2026-05-18 11:15:20 +02:00
Mario Zechner
ed3904ddd3 fix(ai): switch xiaomi models to openai completions
closes #4505
2026-05-18 00:14:31 +02:00
Mario Zechner
266234047a Remove openai-codex fast model variants, they do not work 2026-05-18 00:02:52 +02:00
Mario Zechner
a01cf7afae
Merge pull request #4603 from mattiacerutti/fix/openai-codex-model-list
fix(ai): update OpenAI Codex model list
2026-05-17 20:54:53 +02:00
Mattia Cerutti
485afc9c3f fix(ai): map copilot gpt minimal thinking to low 2026-05-17 12:18:50 +02:00
Mattia Cerutti
1af823be9d fix(ai): update OpenAI Codex model list 2026-05-17 03:12:19 +02:00
Mario Zechner
2c708492e3 Release v0.74.1 2026-05-17 01:35:45 +02:00
Apoorv Saxena
e2b69a0bb1 fix(ai): mark inception/mercury-2 thinkingLevelMap.off as null
Mercury 2 in instant mode (reasoning_effort: "none") disables tool calling.
The openai-completions provider hardcodes {reasoning:{effort:"none"}} when no
explicit reasoning level is passed and thinkingLevelMap.off isn't null
(openai-completions.ts:575), so every caller that doesn't opt in to a level
silently breaks Mercury 2's agentic use cases.

Setting thinkingLevelMap.off = null on the Mercury 2 catalog entry causes the
provider to omit the reasoning param entirely, letting Mercury 2's own default
take over. Low/medium/high pass through verbatim; OpenRouter normalizes them
to Mercury's vocabulary.

Prefix-matched on "inception/mercury-2" so future Mercury 2 variants on
OpenRouter inherit the fix.
2026-05-13 19:17:36 +05:30
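The Mercury 2 fix above (e2b69a0bb1) hinges on how a null entry in thinkingLevelMap is treated. The following is a minimal sketch of that rule, assuming a simplified mapping function; `buildReasoningParam` and the types are hypothetical names for illustration, not the code in openai-completions.ts.

```typescript
// Hypothetical sketch: names and shapes are illustrative only.
type ThinkingLevel = "off" | "low" | "medium" | "high";

// A null entry means "send no reasoning parameter for this level".
type ThinkingLevelMap = Partial<Record<ThinkingLevel, string | null>>;

interface ReasoningParam {
  reasoning: { effort: string };
}

function buildReasoningParam(
  level: ThinkingLevel | undefined,
  map: ThinkingLevelMap = {},
): ReasoningParam | undefined {
  // No explicit level is treated like "off".
  const requested: ThinkingLevel = level ?? "off";
  const mapped = map[requested];
  // null: omit the parameter so the model's own default applies.
  if (mapped === null) return undefined;
  // Unmapped "off" falls back to effort "none"; other levels pass through verbatim.
  const effort = mapped ?? (requested === "off" ? "none" : requested);
  return { reasoning: { effort } };
}

// Catalog entry shape assumed for Mercury 2 after the fix.
const mercury2: ThinkingLevelMap = { off: null };

console.log(buildReasoningParam(undefined, {})); // effort "none" is sent
console.log(buildReasoningParam(undefined, mercury2)); // parameter omitted
console.log(buildReasoningParam("high", mercury2)); // effort "high" passes through
```

Under this sketch, callers that never opt in to a level stop sending effort "none" to Mercury 2, which is the behavior the commit message describes.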
yanirz
99dc6fcec8
fix(ai): add session affinity and compat fixes for Fireworks provider caching
Fireworks prompt caching is enabled by default (automatic prefix matching),
but on serverless infrastructure, requests hit random replicas. Without
session affinity, requests miss the per-replica cache, which lowers cache
hit rates and forfeits the discounted cacheRead pricing.

Changes:

- Add sendSessionAffinityHeaders and supportsCacheControlOnTools
  to AnthropicMessagesCompat interface
- Send x-session-affinity header for Fireworks (and Cloudflare AI
  Gateway Anthropic) when sessionId is available and caching is enabled
- Omit cache_control on tool definitions for Fireworks (unsupported
  per https://docs.fireworks.ai/tools-sdks/anthropic-compatibility)
- Default supportsEagerToolInputStreaming to false for Fireworks
  (unsupported field)
- Default supportsLongCacheRetention to false for Fireworks
  (cache_control.ttl not supported)
- Add compat settings to Fireworks models in generate-models.ts
- Update generated models with Fireworks compat settings
- Add integration tests for session affinity and tool compat

Refs: https://docs.fireworks.ai/guides/prompt-caching
Refs: https://docs.fireworks.ai/tools-sdks/anthropic-compatibility
2026-05-10 00:11:36 +02:00
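The Fireworks changes above (99dc6fcec8) can be sketched roughly as follows. Only the flag names sendSessionAffinityHeaders and supportsCacheControlOnTools and the header name x-session-affinity come from the commit message; the function names, the tool shape, and the use of the session id as the header value are assumptions made for illustration.

```typescript
// Hypothetical sketch, not the actual AnthropicMessagesCompat implementation.
interface CompatSketch {
  sendSessionAffinityHeaders?: boolean;
  supportsCacheControlOnTools?: boolean;
}

interface ToolSketch {
  name: string;
  cache_control?: { type: "ephemeral" };
}

// Emit the affinity header only when the flag is set, a session id exists,
// and caching is enabled. Using the session id as the value is an assumption.
function sessionAffinityHeaders(
  compat: CompatSketch,
  sessionId: string | undefined,
  cachingEnabled: boolean,
): Record<string, string> {
  if (!compat.sendSessionAffinityHeaders || !sessionId || !cachingEnabled) {
    return {};
  }
  return { "x-session-affinity": sessionId };
}

// Strip cache_control from tool definitions when the provider rejects it.
function prepareTools(tools: ToolSketch[], compat: CompatSketch): ToolSketch[] {
  if (compat.supportsCacheControlOnTools !== false) return tools;
  return tools.map((tool) => {
    const copy: ToolSketch = { ...tool };
    delete copy.cache_control;
    return copy;
  });
}

// Assumed compat settings for a Fireworks model.
const fireworks: CompatSketch = {
  sendSessionAffinityHeaders: true,
  supportsCacheControlOnTools: false,
};

console.log(sessionAffinityHeaders(fireworks, "session-123", true));
console.log(prepareTools([{ name: "read", cache_control: { type: "ephemeral" } }], fireworks));
```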
Mario Zechner
7adb8e7634 feat(ai): add Together AI provider
2026-05-08 16:44:18 +02:00
Mario Zechner
9751057be9
Merge pull request #3887 from cristinaponcela/feat/image-outputs
feat: image content
2026-05-08 15:57:06 +02:00
Mario Zechner
783e96a144 fix(ai): disable OpenAI reasoning where supported 2026-05-07 22:52:54 +02:00
Cristina Poncela Cubeiro
ffdf426e33 Merge remote-tracking branch 'upstream/main' into feat/image-outputs 2026-05-07 16:46:17 +02:00
Armin Ronacher
ace3fd7e30 fix(ai): normalize kimi k2p6 alias to kimi-for-coding closes #4218
2026-05-06 11:34:49 +02:00
Cristina Poncela Cubeiro
5731e13a61
Merge branch 'main' into feat/image-outputs 2026-05-04 16:42:51 +01:00
Cristina Poncela Cubeiro
63c61aac6f feat: image models 2026-05-04 17:07:02 +02:00
Cristina Poncela Cubeiro
cbf3c333ef revert 2026-05-04 15:43:45 +02:00
Jake Jia
693888ac47
feat(ai): switch xiaomi default to api billing, add per-region token plan providers (#4112)
Built-in `xiaomi` provider now targets the API billing endpoint (https://api.xiaomimimo.com/anthropic) — a single stable URL for keys issued at platform.xiaomimimo.com. The Token Plan endpoints are exposed as three sibling providers, each with its own env var:

- xiaomi-token-plan-cn: XIAOMI_TOKEN_PLAN_CN_API_KEY
- xiaomi-token-plan-ams: XIAOMI_TOKEN_PLAN_AMS_API_KEY
- xiaomi-token-plan-sgp: XIAOMI_TOKEN_PLAN_SGP_API_KEY

BREAKING CHANGE: users who previously set XIAOMI_API_KEY against the Token Plan AMS endpoint must move to xiaomi-token-plan-ams and set XIAOMI_TOKEN_PLAN_AMS_API_KEY. This also resolves the 401 reported on #4005, where a platform.xiaomimimo.com key fails against the Token Plan endpoint.

closes #4082
2026-05-03 12:57:11 +02:00
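The provider split above (693888ac47) amounts to a provider-to-env-var lookup. This is a sketch under stated assumptions: the three Token Plan variable names come from the commit message, while keeping XIAOMI_API_KEY for the built-in `xiaomi` provider and the helper name `resolveXiaomiApiKey` are assumptions, not the contents of env-api-keys.ts.

```typescript
// Hypothetical lookup table for illustration.
const XIAOMI_ENV_VARS: Record<string, string> = {
  // Assumption: the built-in provider keeps using XIAOMI_API_KEY.
  xiaomi: "XIAOMI_API_KEY",
  "xiaomi-token-plan-cn": "XIAOMI_TOKEN_PLAN_CN_API_KEY",
  "xiaomi-token-plan-ams": "XIAOMI_TOKEN_PLAN_AMS_API_KEY",
  "xiaomi-token-plan-sgp": "XIAOMI_TOKEN_PLAN_SGP_API_KEY",
};

// Return the API key for a provider from an env-like object, if one is set.
function resolveXiaomiApiKey(
  provider: string,
  env: Record<string, string | undefined>,
): string | undefined {
  const name = XIAOMI_ENV_VARS[provider];
  return name ? env[name] : undefined;
}

// A user migrating from the old AMS Token Plan setup sets the new variable.
const env = { XIAOMI_TOKEN_PLAN_AMS_API_KEY: "tp-ams-key" };
console.log(resolveXiaomiApiKey("xiaomi-token-plan-ams", env)); // the AMS key
console.log(resolveXiaomiApiKey("xiaomi", env)); // undefined: XIAOMI_API_KEY is not set
```

In a Node.js process the second argument would typically be `process.env`.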
Jakub Synowiec
c8edb256b9
fix(ai): fix mismatch between models.dev and OpenCode Go (Qwen3.5/3.6, MiniMax M2.7) (#4110)
2026-05-03 00:41:55 +02:00
Mario Zechner
c0e046990e fix(ai): use Xiaomi Token Plan Anthropic endpoint
closes #3912
2026-05-02 01:36:34 +02:00
Mario Zechner
80f06d3636 feat: add model thinking level metadata
closes #3208
2026-05-02 01:21:06 +02:00
Jake Jia
a44622670f
feat(ai): add Xiaomi MiMo provider (#4005)
* fix(ai): include minimax-cn in cross-provider-handoff matrix

* feat(ai): add Xiaomi MiMo provider

Adds Xiaomi MiMo as an openai-completions-compatible provider.

- packages/ai: register provider in types/KnownProvider, env-api-keys (XIAOMI_API_KEY), generate-models, models.generated.ts, overflow util, README, CHANGELOG
- packages/ai/test: extend stream, tokens, abort, empty, context-overflow, overflow, image-tool-result, tool-call-without-result, total-tokens, unicode-surrogate, cross-provider-handoff matrices with Xiaomi
- packages/coding-agent: default model (mimo-v2.5-pro), display name (Xiaomi MiMo), CLI env var docs, README, docs/providers.md

closes #3912

---------

Co-authored-by: Mario Zechner <badlogicgames@gmail.com>
2026-05-02 00:46:05 +02:00
Mario Zechner
f5b6e4fab0 fix(ai): handle OpenRouter DeepSeek V4 reasoning
Closes #4055

Closes #4047
2026-05-01 22:19:06 +02:00
Mario Zechner
a45577bd00 fix(ai): finalize cloudflare gateway provider support 2026-05-01 00:56:05 +02:00
MC
24fb6b833b
feat(ai): add Cloudflare AI Gateway as a provider (#3856)
* feat(ai): add Cloudflare AI Gateway as a provider

Routes through Cloudflare's Unified API (`/compat`) for Workers AI and
Anthropic models, and through the provider-specific `/openai` subpath
for OpenAI models so reasoning models (gpt-5.x, o-series) can hit
`/v1/responses` natively. Once `/compat` adds Responses-API support,
the OpenAI subpath can be folded back in.

Catalog layout:
  workers-ai/@cf/...  -> openai-completions, gateway/.../compat
  anthropic/...       -> openai-completions, gateway/.../compat
  <native-id>         -> openai-responses,   gateway/.../openai
                         (gpt-5.x and o-series only;
                          prefix stripped because the OpenAI SDK posts native ids)

Touches:
  packages/ai/src/types.ts                       add cloudflare-ai-gateway to KnownProvider
  packages/ai/src/env-api-keys.ts                map to CLOUDFLARE_API_KEY
  packages/ai/src/providers/cloudflare.ts        add CLOUDFLARE_AI_GATEWAY_COMPAT_BASE_URL
                                                 and CLOUDFLARE_AI_GATEWAY_OPENAI_BASE_URL
  packages/ai/src/providers/openai-responses.ts  one-line dispatch through resolveCloudflareBaseUrl
                                                 (matches what openai-completions.ts already does)
  packages/ai/scripts/generate-models.ts         branch openai/* vs workers-ai/anthropic/*
  packages/ai/src/models.generated.ts            spliced 34 entries
  packages/ai/test/stream.test.ts                3 e2e blocks (one per upstream)
  packages/coding-agent/*                        defaultModelPerProvider, login, env docs,
                                                 README, providers.md

Verified end-to-end against a real Cloudflare account with unified
billing: 9/9 e2e tests pass across all three upstreams (Workers AI
Kimi K2.6, OpenAI gpt-5.1 reasoning, Anthropic claude-sonnet-4-5).

* refactor(ai): move AI Gateway User-Agent and per-route session-affinity flag to catalog

Mirrors the same per-model metadata refactor done for Workers AI in the
parent branch. All cloudflare-ai-gateway entries get the User-Agent
header. Only workers-ai/* gateway entries set
`compat.sendSessionAffinityHeaders: true` because the gateway
forwards that header to the underlying Workers AI runtime; anthropic/*
upstream and openai/* (openai-responses) don't use it.

  packages/ai/scripts/generate-models.ts: emit headers (always) and
  per-upstream compat (workers-ai only) on each cloudflare-ai-gateway
  entry.
  packages/ai/src/models.generated.ts: re-spliced 35 entries with
  headers + conditional compat.

Behavior unchanged - 9/9 e2e tests pass across all three upstream
families.

* fix(ai): align AI Gateway with telemetry-aware UA helper

Adapts to badlogic/pi-mono#3851's follow-up fix ("honor telemetry for
Cloudflare attribution headers", fbb5eed) which moved the
'User-Agent: pi-coding-agent' header out of per-model catalog metadata
and into a centralized telemetry-honoring helper
(coding-agent/src/core/sdk.ts:getAttributionHeaders).

- packages/coding-agent/src/core/sdk.ts: extend the cloudflare branch of
  getAttributionHeaders to also match cloudflare-ai-gateway and
  gateway.ai.cloudflare.com.

- packages/ai/scripts/generate-models.ts and src/models.generated.ts:
  drop 'headers' from the 35 cloudflare-ai-gateway entries (constant
  CLOUDFLARE_STATIC_HEADERS no longer exists). Per-route
  compat.sendSessionAffinityHeaders is unchanged.

End-to-end behavior unchanged: 9/9 tests still pass across all three
upstream families (Workers AI, Anthropic, OpenAI Responses).

---------

Co-authored-by: Mario Zechner <badlogicgames@gmail.com>
2026-04-30 23:29:37 +02:00
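The catalog layout described in the AI Gateway commit above (24fb6b833b) can be read as a routing rule keyed on the upstream id prefix. The sketch below illustrates that rule only; `routeGatewayModel` and `GatewayRoute` are hypothetical names, and the real branching lives in generate-models.ts.

```typescript
// Hypothetical sketch of the prefix-based routing described in the commit message.
type Api = "openai-completions" | "openai-responses";

interface GatewayRoute {
  id: string; // model id stored in the catalog
  api: Api; // API protocol used for requests
  subpath: "compat" | "openai"; // gateway subpath the request goes through
}

function routeGatewayModel(upstreamId: string): GatewayRoute {
  // Workers AI and Anthropic models go through the Unified API (/compat)
  // and keep their prefixed ids.
  if (upstreamId.startsWith("workers-ai/") || upstreamId.startsWith("anthropic/")) {
    return { id: upstreamId, api: "openai-completions", subpath: "compat" };
  }
  // OpenAI models use the provider-specific /openai subpath with native ids,
  // so the "openai/" prefix is stripped.
  const prefix = "openai/";
  const id = upstreamId.startsWith(prefix) ? upstreamId.slice(prefix.length) : upstreamId;
  return { id, api: "openai-responses", subpath: "openai" };
}

console.log(routeGatewayModel("workers-ai/@cf/moonshotai/kimi-k2.6"));
console.log(routeGatewayModel("anthropic/claude-sonnet-4-5"));
console.log(routeGatewayModel("openai/gpt-5.1"));
```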
Mario Zechner
fe66edd943 remove gemini cli and antigravity support 2026-04-30 21:24:36 +02:00
Armin Ronacher
7dc1bed478 feat(ai): add Moonshot AI provider model support
2026-04-30 17:21:03 +02:00
Johannes Ebeling
779d0ef39d
feat(ai): add Mistral Medium 3.5 model (#4009) 2026-04-30 12:18:17 +02:00
Mario Zechner
c1dd6082ee fix(ai): apply DeepSeek V4 reasoning compat
closes #3940
2026-04-29 23:25:03 +02:00
Mario Zechner
ae81deb4c3 fix(ai): correct DeepSeek V4 pricing metadata
closes #3910
2026-04-29 22:58:19 +02:00
Cristina Poncela Cubeiro
e9414b0500 fix 2026-04-29 09:49:09 +02:00
Cristina Poncela Cubeiro
c3c10737d8 feat: image content 2026-04-28 13:43:28 +02:00
Mario Zechner
fbb5eed191 fix: honor telemetry for Cloudflare attribution headers
2026-04-27 23:49:14 +02:00
MC
d6e08b3da0
feat(ai): add Cloudflare Workers AI as a provider (#3851)
* feat(ai): add Cloudflare Workers AI as a provider

Cloudflare Workers AI hosts open-weight LLMs (Kimi K2.6, GPT-OSS,
GLM-4.7, Llama 4, Gemma 4, Nemotron 3) on Cloudflare's GPU network with
an OpenAI-compatible endpoint. Reuses the openai-completions API
protocol; the per-account URL contains a {CLOUDFLARE_ACCOUNT_ID}
placeholder resolved at request time by a small helper.

Pi automatically sets x-session-affinity for prefix caching:
https://developers.cloudflare.com/workers-ai/features/prompt-caching/

Auth: CLOUDFLARE_API_KEY (matches pi's *_API_KEY convention) +
CLOUDFLARE_ACCOUNT_ID. The User-Agent identifies traffic as
'pi-coding-agent' in Cloudflare analytics.

Verified end-to-end against a real Cloudflare account: 17 e2e tests
pass across stream/empty/tokens/unicode/tool-call-without-result/
total-tokens against @cf/moonshotai/kimi-k2.6.

Cloudflare AI Gateway is a separate, larger change (it requires routing
through provider-specific subpaths with the matching API protocol per
upstream) and will land in a follow-up PR.

* refactor(ai): move Cloudflare User-Agent and session-affinity flag to per-model metadata

Instead of conditionally setting them in openai-completions.ts based on
provider detection, declare them as model-level fields in the catalog
(headers + compat). This is consistent with how the github-copilot and
kimi-coding entries already declare their static headers.

  packages/ai/scripts/generate-models.ts: emit headers and compat fields
  on each cloudflare-workers-ai entry (CLOUDFLARE_STATIC_HEADERS).
  packages/ai/src/providers/openai-completions.ts: drop the
  isCloudflareProvider conditional that injected User-Agent and the
  isCloudflareWorkersAI override of sendSessionAffinityHeaders.
  packages/ai/src/models.generated.ts: re-spliced 8 cloudflare-workers-ai
  entries with headers + compat.

Behavior is unchanged - verified via fetch interceptor that User-Agent
and x-session-affinity / session_id / x-client-request-id are still sent
on outbound requests. 5/5 e2e tests pass.
2026-04-27 23:41:54 +02:00
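The Workers AI commit above (d6e08b3da0) mentions a small helper that resolves the {CLOUDFLARE_ACCOUNT_ID} placeholder at request time. A minimal sketch of such a helper might look like this; `resolveAccountId` is a hypothetical name, the example URL is a stand-in rather than the real endpoint, and the error behavior is an assumption.

```typescript
// Hypothetical sketch; the actual helper in packages/ai may differ.
const ACCOUNT_ID_PLACEHOLDER = "{CLOUDFLARE_ACCOUNT_ID}";

function resolveAccountId(baseUrl: string, accountId: string | undefined): string {
  // URLs without the placeholder are returned unchanged.
  if (!baseUrl.includes(ACCOUNT_ID_PLACEHOLDER)) return baseUrl;
  // Assumption: a missing account id is treated as a configuration error.
  if (!accountId) {
    throw new Error("CLOUDFLARE_ACCOUNT_ID is required for this base URL");
  }
  // Replace every occurrence of the placeholder with the encoded account id.
  return baseUrl.split(ACCOUNT_ID_PLACEHOLDER).join(encodeURIComponent(accountId));
}

// Stand-in URL for illustration only.
const template = "https://example.invalid/accounts/{CLOUDFLARE_ACCOUNT_ID}/ai/v1";
console.log(resolveAccountId(template, "abc123"));
```

Resolving the placeholder per request, rather than when the catalog is generated, keeps the generated model entries independent of any one account.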
Mario Zechner
9b103e5e41 fix(ai): replay DeepSeek V4 reasoning content
closes #3668
2026-04-24 19:32:51 +02:00
Mario Zechner
1e33492525 fix(coding-agent): harden clipboard copy
closes #3639
2026-04-24 12:55:58 +02:00
Mario Zechner
c96c2fcd1e fix(ai): correct gpt-5.5 context metadata 2026-04-24 10:42:44 +02:00
Mario Zechner
1312346199 fix(ai): expand Copilot eager streaming compat
closes #3575
2026-04-23 23:43:00 +02:00
Mario Zechner
ffa0f31239 fix(ai): support Anthropic eager tool streaming compat
closes #3575
2026-04-23 23:12:45 +02:00
Mario Zechner
f70d041e71 feat(ai): add GPT-5.5 Codex model 2026-04-23 21:36:16 +02:00
Mario Zechner
f0ebb327f2 fix(ai): set default Kimi Coding user agent closes #3586 2026-04-23 12:12:54 +02:00
Mario Zechner
6553141f69 fix(ai): add gemini 3.1 flash lite cloud code assist model
closes #3545
2026-04-22 19:13:51 +02:00
Mario Zechner
0bb0a58466 feat(ai): add Fireworks provider support closes #3519 2026-04-22 01:09:11 +02:00
Mario Zechner
a0a16c7762 fix(amazon-bedrock): restore regional endpoint resolution
closes #3481
closes #3485
closes #3486
closes #3487
closes #3488
2026-04-21 13:20:10 +02:00
Mario Zechner
3054fd7a3b fix(ai,coding-agent): support anthropic-style cache control for openai compatibles closes #3392 2026-04-20 17:12:05 +02:00
Armin Ronacher
a91978cf19 fix(ai): add temporary Anthropic Opus 4.7 model override 2026-04-16 17:06:23 +02:00
Mario Zechner
eb1cf80b10 fix(ai,coding-agent): replace deprecated kimi k2p5 model closes #3242 2026-04-16 12:06:24 +02:00
Vladyslav Tkachenko
a9bd8045d6
fix: update zai processing logic (#2855)
* feat: add new models and update zai processing logic

* chore(ai): removed overrides, simplify provider pick

---------

Co-authored-by: Mario Zechner <badlogicgames@gmail.com>
2026-04-05 23:44:17 +02:00
Kao Félix
758ede4da0
Enable tool streaming for newer Z.ai models (#2732) 2026-03-31 14:28:24 +02:00