Adds a Context Overflow Errors subsection to custom-provider.md showing how a custom provider extension can rewrite unrecognized overflow errors via a message_end handler so pi's native compact-and-retry path triggers. Includes guard rails for provider scoping, idempotency, and avoiding rate-limit phrases.
The notify() API only accepts "info" | "warning" | "error" — "success"
was a copy-paste error in the example.
Co-Authored-By: julien-agent <Agents+cyolo@huggingface.co>
Built-in `xiaomi` provider now targets the API billing endpoint (https://api.xiaomimimo.com/anthropic) — a single stable URL for keys issued at platform.xiaomimimo.com. The Token Plan endpoints are exposed as three sibling providers, each with its own env var:
- xiaomi-token-plan-cn: XIAOMI_TOKEN_PLAN_CN_API_KEY
- xiaomi-token-plan-ams: XIAOMI_TOKEN_PLAN_AMS_API_KEY
- xiaomi-token-plan-sgp: XIAOMI_TOKEN_PLAN_SGP_API_KEY
BREAKING CHANGE: users who previously set XIAOMI_API_KEY against the Token Plan AMS endpoint must move to xiaomi-token-plan-ams and set XIAOMI_TOKEN_PLAN_AMS_API_KEY. This also resolves the 401 reported by on #4005, where a platform.xiaomimimo.com key fails against the Token Plan endpoint.
closes#4082
* feat(ai): add Cloudflare AI Gateway as a provider
Routes through Cloudflare's Unified API (`/compat`) for Workers AI and
Anthropic models, and through the provider-specific `/openai` subpath
for OpenAI models so reasoning models (gpt-5.x, o-series) can hit
`/v1/responses` natively. Once `/compat` adds Responses-API support,
the OpenAI subpath can be folded back in.
Catalog layout:
workers-ai/@cf/... -> openai-completions, gateway/.../compat
anthropic/... -> openai-completions, gateway/.../compat
<native-id> -> openai-responses, gateway/.../openai
(gpt-5.1, claude-... no, sorry: gpt-5.x and o-series only;
prefix stripped because the OpenAI SDK posts native ids)
Touches:
packages/ai/src/types.ts add cloudflare-ai-gateway to KnownProvider
packages/ai/src/env-api-keys.ts map to CLOUDFLARE_API_KEY
packages/ai/src/providers/cloudflare.ts add CLOUDFLARE_AI_GATEWAY_COMPAT_BASE_URL
and CLOUDFLARE_AI_GATEWAY_OPENAI_BASE_URL
packages/ai/src/providers/openai-responses.ts one-line dispatch through resolveCloudflareBaseUrl
(matches what openai-completions.ts already does)
packages/ai/scripts/generate-models.ts branch openai/* vs workers-ai/anthropic/*
packages/ai/src/models.generated.ts spliced 34 entries
packages/ai/test/stream.test.ts 3 e2e blocks (one per upstream)
packages/coding-agent/* defaultModelPerProvider, login, env docs,
README, providers.md
Verified end-to-end against a real Cloudflare account with unified
billing: 9/9 e2e tests pass across all three upstreams (Workers AI
Kimi K2.6, OpenAI gpt-5.1 reasoning, Anthropic claude-sonnet-4-5).
* refactor(ai): move AI Gateway User-Agent and per-route session-affinity flag to catalog
Mirrors the same per-model metadata refactor done for Workers AI in the
parent branch. All cloudflare-ai-gateway entries get the User-Agent
header. Only workers-ai/* gateway entries set
`compat.sendSessionAffinityHeaders: true` because the gateway
forwards that header to the underlying Workers AI runtime; anthropic/*
upstream and openai/* (openai-responses) don't use it.
packages/ai/scripts/generate-models.ts: emit headers (always) and
per-upstream compat (workers-ai only) on each cloudflare-ai-gateway
entry.
packages/ai/src/models.generated.ts: re-spliced 35 entries with
headers + conditional compat.
Behavior unchanged - 9/9 e2e tests pass across all three upstream
families.
* fix(ai): align AI Gateway with telemetry-aware UA helper
Adapts to badlogic/pi-mono#3851's follow-up fix ("honor telemetry for
Cloudflare attribution headers", fbb5eed) which moved the
'User-Agent: pi-coding-agent' header out of per-model catalog metadata
and into a centralized telemetry-honoring helper
(coding-agent/src/core/sdk.ts:getAttributionHeaders).
- packages/coding-agent/src/core/sdk.ts: extend the cloudflare branch of
getAttributionHeaders to also match cloudflare-ai-gateway and
gateway.ai.cloudflare.com.
- packages/ai/scripts/generate-models.ts and src/models.generated.ts:
drop 'headers' from the 35 cloudflare-ai-gateway entries (constant
CLOUDFLARE_STATIC_HEADERS no longer exists). Per-route
compat.sendSessionAffinityHeaders is unchanged.
End-to-end behavior unchanged: 9/9 tests still pass across all three
upstream families (Workers AI, Anthropic, OpenAI Responses).
---------
Co-authored-by: Mario Zechner <badlogicgames@gmail.com>
* Revert "fix(coding-agent): use alternate logic to find Bun's node_modules (#3861)"
This reverts commit c241c6d6d0. The logic
is faulty: the original strategy of looking for node_modules by asking
the package manager is not incorrect even on bun. Instead, it should
learn a different method of asking the package manager for node_modules
when the *package manager* is bun, not the *runtime*.
* feat(coding-agent): detect bun as package manager and use alternate root query
When `"npmCommand": ["bun"]` is configured in settings.json, pi fails to
start because it invokes `bun root -g`, which doesn't exist:
error: Failed to run bun root -g: error: Script not found "root"
Add a (simple) check for Bun being used as package manager, and instead
build the relative path starting from Bun's bin directory.
* feat(ai): add Cloudflare Workers AI as a provider
Cloudflare Workers AI hosts open-weight LLMs (Kimi K2.6, GPT-OSS,
GLM-4.7, Llama 4, Gemma 4, Nemotron 3) on Cloudflare's GPU network with
an OpenAI-compatible endpoint. Reuses the openai-completions API
protocol; the per-account URL contains a {CLOUDFLARE_ACCOUNT_ID}
placeholder resolved at request time by a small helper.
Pi automatically sets x-session-affinity for prefix caching:
https://developers.cloudflare.com/workers-ai/features/prompt-caching/
Auth: CLOUDFLARE_API_KEY (matches pi's *_API_KEY convention) +
CLOUDFLARE_ACCOUNT_ID. The User-Agent identifies traffic as
'pi-coding-agent' in Cloudflare analytics.
Verified end-to-end against a real Cloudflare account: 17 e2e tests
pass across stream/empty/tokens/unicode/tool-call-without-result/
total-tokens against @cf/moonshotai/kimi-k2.6.
Cloudflare AI Gateway is a separate, larger change (it requires routing
through provider-specific subpaths with the matching API protocol per
upstream) and will land in a follow-up PR.
* refactor(ai): move Cloudflare User-Agent and session-affinity flag to per-model metadata
Instead of conditionally setting them in openai-completions.ts based on
provider detection, declare them as model-level fields in the catalog
(headers + compat). This is consistent with how the github-copilot and
kimi-coding entries already declare their static headers.
packages/ai/scripts/generate-models.ts: emit headers and compat fields
on each cloudflare-workers-ai entry (CLOUDFLARE_STATIC_HEADERS).
packages/ai/src/providers/openai-completions.ts: drop the
isCloudflareProvider conditional that injected User-Agent and the
isCloudflareWorkersAI override of sendSessionAffinityHeaders.
packages/ai/src/models.generated.ts: re-spliced 8 cloudflare-workers-ai
entries with headers + compat.
Behavior is unchanged - verified via fetch interceptor that User-Agent
and x-session-affinity / session_id / x-client-request-id are still sent
on outbound requests. 5/5 e2e tests pass.