pi-mono

mirror of https://github.com/badlogic/pi-mono.git synced 2026-06-01 14:39:47 +00:00

Author	SHA1	Message	Date
yanirz	99dc6fcec8	fix(ai): add session affinity and compat fixes for Fireworks provider caching Fireworks prompt caching is enabled by default (automatic prefix matching), but on serverless infrastructure, requests hit random replicas. Without session affinity, the per-replica cache misses, negating cache hit rates and the discounted cacheRead pricing. Changes: - Add sendSessionAffinityHeaders and supportsCacheControlOnTools to AnthropicMessagesCompat interface - Send x-session-affinity header for Fireworks (and Cloudflare AI Gateway Anthropic) when sessionId is available and caching is enabled - Omit cache_control on tool definitions for Fireworks (unsupported per https://docs.fireworks.ai/tools-sdks/anthropic-compatibility) - Default supportsEagerToolInputStreaming to false for Fireworks (unsupported field) - Default supportsLongCacheRetention to false for Fireworks (cache_control.ttl not supported) - Add compat settings to Fireworks models in generate-models.ts - Update generated models with Fireworks compat settings - Add integration tests for session affinity and tool compat Refs: https://docs.fireworks.ai/guides/prompt-caching Refs: https://docs.fireworks.ai/tools-sdks/anthropic-compatibility	2026-05-10 00:11:36 +02:00
Mario Zechner	7adb8e7634	feat(ai): add Together AI provider	2026-05-08 16:44:18 +02:00
Mario Zechner	9751057be9	Merge pull request #3887 from cristinaponcela/feat/image-outputs feat: image content	2026-05-08 15:57:06 +02:00
Mario Zechner	783e96a144	fix(ai): disable OpenAI reasoning where supported	2026-05-07 22:52:54 +02:00
Cristina Poncela Cubeiro	ffdf426e33	Merge remote-tracking branch 'upstream/main' into feat/image-outputs	2026-05-07 16:46:17 +02:00
Armin Ronacher	ace3fd7e30	fix(ai): normalize kimi k2p6 alias to kimi-for-coding closes #4218	2026-05-06 11:34:49 +02:00
Cristina Poncela Cubeiro	5731e13a61	Merge branch 'main' into feat/image-outputs	2026-05-04 16:42:51 +01:00
Cristina Poncela Cubeiro	63c61aac6f	feat: image models	2026-05-04 17:07:02 +02:00
Cristina Poncela Cubeiro	cbf3c333ef	revert	2026-05-04 15:43:45 +02:00
Jake Jia	693888ac47	feat(ai): switch xiaomi default to api billing, add per-region token plan providers (#4112) Built-in `xiaomi` provider now targets the API billing endpoint (https://api.xiaomimimo.com/anthropic), a single stable URL for keys issued at platform.xiaomimimo.com. The Token Plan endpoints are exposed as three sibling providers, each with its own env var: - xiaomi-token-plan-cn: XIAOMI_TOKEN_PLAN_CN_API_KEY - xiaomi-token-plan-ams: XIAOMI_TOKEN_PLAN_AMS_API_KEY - xiaomi-token-plan-sgp: XIAOMI_TOKEN_PLAN_SGP_API_KEY BREAKING CHANGE: users who previously set XIAOMI_API_KEY against the Token Plan AMS endpoint must move to xiaomi-token-plan-ams and set XIAOMI_TOKEN_PLAN_AMS_API_KEY. This also resolves the 401 reported on #4005, where a platform.xiaomimimo.com key fails against the Token Plan endpoint. closes #4082	2026-05-03 12:57:11 +02:00
Jakub Synowiec	c8edb256b9	fix(ai): fix mismatch between models.dev and OpenCode Go (Qwen3.5/3.6, MiniMax M2.7) (#4110)	2026-05-03 00:41:55 +02:00
Mario Zechner	c0e046990e	fix(ai): use Xiaomi Token Plan Anthropic endpoint closes #3912	2026-05-02 01:36:34 +02:00
Mario Zechner	80f06d3636	feat: add model thinking level metadata closes #3208	2026-05-02 01:21:06 +02:00
Jake Jia	a44622670f	feat(ai): add Xiaomi MiMo provider (#4005) * fix(ai): include minimax-cn in cross-provider-handoff matrix * feat(ai): add Xiaomi MiMo provider Adds Xiaomi MiMo as an openai-completions-compatible provider. - packages/ai: register provider in types/KnownProvider, env-api-keys (XIAOMI_API_KEY), generate-models, models.generated.ts, overflow util, README, CHANGELOG - packages/ai/test: extend stream, tokens, abort, empty, context-overflow, overflow, image-tool-result, tool-call-without-result, total-tokens, unicode-surrogate, cross-provider-handoff matrices with Xiaomi - packages/coding-agent: default model (mimo-v2.5-pro), display name (Xiaomi MiMo), CLI env var docs, README, docs/providers.md closes #3912 --------- Co-authored-by: Mario Zechner <badlogicgames@gmail.com>	2026-05-02 00:46:05 +02:00
Mario Zechner	f5b6e4fab0	fix(ai): handle OpenRouter DeepSeek V4 reasoning Closes #4055 Closes #4047	2026-05-01 22:19:06 +02:00
Mario Zechner	a45577bd00	fix(ai): finalize cloudflare gateway provider support	2026-05-01 00:56:05 +02:00
MC	24fb6b833b	feat(ai): add Cloudflare AI Gateway as a provider (#3856) * feat(ai): add Cloudflare AI Gateway as a provider Routes through Cloudflare's Unified API (`/compat`) for Workers AI and Anthropic models, and through the provider-specific `/openai` subpath for OpenAI models so reasoning models (gpt-5.x, o-series) can hit `/v1/responses` natively. Once `/compat` adds Responses-API support, the OpenAI subpath can be folded back in. Catalog layout: workers-ai/@cf/... -> openai-completions, gateway/.../compat anthropic/... -> openai-completions, gateway/.../compat <native-id> -> openai-responses, gateway/.../openai (gpt-5.x and o-series only; prefix stripped because the OpenAI SDK posts native ids) Touches: packages/ai/src/types.ts add cloudflare-ai-gateway to KnownProvider packages/ai/src/env-api-keys.ts map to CLOUDFLARE_API_KEY packages/ai/src/providers/cloudflare.ts add CLOUDFLARE_AI_GATEWAY_COMPAT_BASE_URL and CLOUDFLARE_AI_GATEWAY_OPENAI_BASE_URL packages/ai/src/providers/openai-responses.ts one-line dispatch through resolveCloudflareBaseUrl (matches what openai-completions.ts already does) packages/ai/scripts/generate-models.ts branch openai/* vs workers-ai/anthropic/* packages/ai/src/models.generated.ts spliced 34 entries packages/ai/test/stream.test.ts 3 e2e blocks (one per upstream) packages/coding-agent/* defaultModelPerProvider, login, env docs, README, providers.md Verified end-to-end against a real Cloudflare account with unified billing: 9/9 e2e tests pass across all three upstreams (Workers AI Kimi K2.6, OpenAI gpt-5.1 reasoning, Anthropic claude-sonnet-4-5). * refactor(ai): move AI Gateway User-Agent and per-route session-affinity flag to catalog Mirrors the same per-model metadata refactor done for Workers AI in the parent branch. All cloudflare-ai-gateway entries get the User-Agent header. Only workers-ai/* gateway entries set `compat.sendSessionAffinityHeaders: true` because the gateway forwards that header to the underlying Workers AI runtime; anthropic/* upstream and openai/* (openai-responses) don't use it. packages/ai/scripts/generate-models.ts: emit headers (always) and per-upstream compat (workers-ai only) on each cloudflare-ai-gateway entry. packages/ai/src/models.generated.ts: re-spliced 35 entries with headers + conditional compat. Behavior unchanged - 9/9 e2e tests pass across all three upstream families. * fix(ai): align AI Gateway with telemetry-aware UA helper Adapts to badlogic/pi-mono#3851's follow-up fix ("honor telemetry for Cloudflare attribution headers", `fbb5eed`) which moved the 'User-Agent: pi-coding-agent' header out of per-model catalog metadata and into a centralized telemetry-honoring helper (coding-agent/src/core/sdk.ts:getAttributionHeaders). - packages/coding-agent/src/core/sdk.ts: extend the cloudflare branch of getAttributionHeaders to also match cloudflare-ai-gateway and gateway.ai.cloudflare.com. - packages/ai/scripts/generate-models.ts and src/models.generated.ts: drop 'headers' from the 35 cloudflare-ai-gateway entries (constant CLOUDFLARE_STATIC_HEADERS no longer exists). Per-route compat.sendSessionAffinityHeaders is unchanged. End-to-end behavior unchanged: 9/9 tests still pass across all three upstream families (Workers AI, Anthropic, OpenAI Responses). --------- Co-authored-by: Mario Zechner <badlogicgames@gmail.com>	2026-04-30 23:29:37 +02:00
Mario Zechner	fe66edd943	remove gemini cli and antigravity support	2026-04-30 21:24:36 +02:00
Armin Ronacher	7dc1bed478	feat(ai): add Moonshot AI provider model support	2026-04-30 17:21:03 +02:00
Johannes Ebeling	779d0ef39d	feat(ai): add Mistral Medium 3.5 model (#4009)	2026-04-30 12:18:17 +02:00
Mario Zechner	c1dd6082ee	fix(ai): apply DeepSeek V4 reasoning compat closes #3940	2026-04-29 23:25:03 +02:00
Mario Zechner	ae81deb4c3	fix(ai): correct DeepSeek V4 pricing metadata closes #3910	2026-04-29 22:58:19 +02:00
Cristina Poncela Cubeiro	e9414b0500	fix	2026-04-29 09:49:09 +02:00
Cristina Poncela Cubeiro	c3c10737d8	feat: image content	2026-04-28 13:43:28 +02:00
Mario Zechner	fbb5eed191	fix: honor telemetry for Cloudflare attribution headers	2026-04-27 23:49:14 +02:00
MC	d6e08b3da0	feat(ai): add Cloudflare Workers AI as a provider (#3851) * feat(ai): add Cloudflare Workers AI as a provider Cloudflare Workers AI hosts open-weight LLMs (Kimi K2.6, GPT-OSS, GLM-4.7, Llama 4, Gemma 4, Nemotron 3) on Cloudflare's GPU network with an OpenAI-compatible endpoint. Reuses the openai-completions API protocol; the per-account URL contains a {CLOUDFLARE_ACCOUNT_ID} placeholder resolved at request time by a small helper. Pi automatically sets x-session-affinity for prefix caching: https://developers.cloudflare.com/workers-ai/features/prompt-caching/ Auth: CLOUDFLARE_API_KEY (matches pi's _API_KEY convention) + CLOUDFLARE_ACCOUNT_ID. The User-Agent identifies traffic as 'pi-coding-agent' in Cloudflare analytics. Verified end-to-end against a real Cloudflare account: 17 e2e tests pass across stream/empty/tokens/unicode/tool-call-without-result/total-tokens against @cf/moonshotai/kimi-k2.6. Cloudflare AI Gateway is a separate, larger change (it requires routing through provider-specific subpaths with the matching API protocol per upstream) and will land in a follow-up PR. refactor(ai): move Cloudflare User-Agent and session-affinity flag to per-model metadata Instead of conditionally setting them in openai-completions.ts based on provider detection, declare them as model-level fields in the catalog (headers + compat). This is consistent with how the github-copilot and kimi-coding entries already declare their static headers. packages/ai/scripts/generate-models.ts: emit headers and compat fields on each cloudflare-workers-ai entry (CLOUDFLARE_STATIC_HEADERS). packages/ai/src/providers/openai-completions.ts: drop the isCloudflareProvider conditional that injected User-Agent and the isCloudflareWorkersAI override of sendSessionAffinityHeaders. packages/ai/src/models.generated.ts: re-spliced 8 cloudflare-workers-ai entries with headers + compat. Behavior is unchanged - verified via fetch interceptor that User-Agent and x-session-affinity / session_id / x-client-request-id are still sent on outbound requests. 5/5 e2e tests pass.	2026-04-27 23:41:54 +02:00
Mario Zechner	9b103e5e41	fix(ai): replay DeepSeek V4 reasoning content closes #3668	2026-04-24 19:32:51 +02:00
Mario Zechner	1e33492525	fix(coding-agent): harden clipboard copy closes #3639	2026-04-24 12:55:58 +02:00
Mario Zechner	c96c2fcd1e	fix(ai): correct gpt-5.5 context metadata	2026-04-24 10:42:44 +02:00
Mario Zechner	1312346199	fix(ai): expand Copilot eager streaming compat closes #3575	2026-04-23 23:43:00 +02:00
Mario Zechner	ffa0f31239	fix(ai): support Anthropic eager tool streaming compat closes #3575	2026-04-23 23:12:45 +02:00
Mario Zechner	f70d041e71	feat(ai): add GPT-5.5 Codex model	2026-04-23 21:36:16 +02:00
Mario Zechner	f0ebb327f2	fix(ai): set default Kimi Coding user agent closes #3586	2026-04-23 12:12:54 +02:00
Mario Zechner	6553141f69	fix(ai): add gemini 3.1 flash lite cloud code assist model closes #3545	2026-04-22 19:13:51 +02:00
Mario Zechner	0bb0a58466	feat(ai): add Fireworks provider support closes #3519	2026-04-22 01:09:11 +02:00
Mario Zechner	a0a16c7762	fix(amazon-bedrock): restore regional endpoint resolution closes #3481 closes #3485 closes #3486 closes #3487 closes #3488	2026-04-21 13:20:10 +02:00
Mario Zechner	3054fd7a3b	fix(ai,coding-agent): support anthropic-style cache control for openai compatibles closes #3392	2026-04-20 17:12:05 +02:00
Armin Ronacher	a91978cf19	fix(ai): add temporary Anthropic Opus 4.7 model override	2026-04-16 17:06:23 +02:00
Mario Zechner	eb1cf80b10	fix(ai,coding-agent): replace deprecated kimi k2p5 model closes #3242	2026-04-16 12:06:24 +02:00
Vladyslav Tkachenko	a9bd8045d6	fix: update zai processing logic (#2855) * feat: add new models and update zai processing logic * chore(ai): removed overrides, simplify provider pick --------- Co-authored-by: Mario Zechner <badlogicgames@gmail.com>	2026-04-05 23:44:17 +02:00
Kao Félix	758ede4da0	Enable tool streaming for newer Z.ai models (#2732)	2026-03-31 14:28:24 +02:00
Gordon Hui	17625cc8a2	feat(ai): add google-vertex gemini-3.1-pro-preview-customtools (#2610)	2026-03-27 02:45:58 +01:00
Mario Zechner	6dc43d6dd1	fix(ai): prune deprecated direct minimax models	2026-03-25 22:44:32 +01:00
简简简简	8705fbee54	fix(models): align minimax and zai defaults (#2445) Update the coding-agent default model picks for ZAI, Cerebras, and MiniMax so new sessions prefer the current model lineup. Add the missing MiniMax-M2.1-highspeed direct provider entries and normalize MiniMax Anthropic-compatible context limits so the catalog matches the provider's supported model set.	2026-03-20 10:00:57 +01:00
Jheng-Hong Yang	68da22f18c	feat(ai): add openai-codex gpt-5.4-mini (#2334)	2026-03-18 11:22:11 +01:00
Mario Zechner	a9f534adfc	fix(ai): restore antigravity context override block	2026-03-18 00:16:01 +01:00
Mario Zechner	d70dfbeb3e	fix(ai): correct Bedrock Claude 4.6 context window to 200k Bedrock Claude Opus 4.6 and Sonnet 4.6 models have 200k context window, not 1M. Removed incorrect overrides that were forcing these models to 1M. The native Anthropic API models correctly remain at 1M. closes #2305	2026-03-18 00:08:50 +01:00
Mario Zechner	747521227f	fix(ai): correct Claude 4.6 context overrides closes #2286	2026-03-17 12:32:03 +01:00
Armin Ronacher	d574c03e19	Raise context length to 1M (#2135)	2026-03-14 01:34:25 +01:00
Mario Zechner	e4172e68d0	feat(ai): add claude-sonnet-4-6 to Antigravity, fix Claude thinking header detection, bump UA to 1.18.4 closes #1859	2026-03-06 13:27:35 +01:00

132 commits