Commit graph

486 commits

Author SHA1 Message Date
Vlad Ionescu
7dcd422a73
opencode: Model updates (#57076)
**TL;DR**: clearer docs + models cleanup.

----

**Docs**: 
- as per the discussion in
https://github.com/zed-industries/zed/issues/56869, added a note to the
docs highlighting that temporary models should be configured using
Custom Models. Adding a whole example felt redundant considering the
full example is literally 2 rows below.


**Model updates**:
- **Ring 2.6 1T Free**: removed
- **GLM 5 and GLM 5.1**: different settings based on subscription —
[131k](8e710e19ea/providers/opencode/models/glm-5.1.toml (L22))
[output](8e710e19ea/providers/opencode/models/glm-5.toml (L22))
on OpenCode but
[32k](8e710e19ea/providers/opencode-go/models/glm-5.1.toml (L22))
[output](8e710e19ea/providers/opencode-go/models/glm-5.toml (L22))
on OpenCode Go
- **MiniMax M2.5**: different settings based on subscription — [131k
output on
OpenCode](8e710e19ea/providers/opencode/models/minimax-m2.5.toml (L22))
and [65k on OpenCode
Go](8e710e19ea/providers/opencode-go/models/minimax-m2.5.toml (L19))
- **Nemotron 3 Super Free**: enabled interleaved reasoning as per
[docs](8e710e19ea/providers/opencode/models/nemotron-3-super-free.toml (L13)).
Ran some quick tests and confirmed everything seems to work fine
(_"rename this variable. add a simple function. remove the function.
tell me a joke"_)
- **GPT 5.3 Codex Spark**: removed image support as per
[docs](https://github.com/anomalyco/models.dev/blob/dev/providers/opencode/models/gpt-5.3-codex-spark.toml#L23-L25)

The [docs say GLM 5 in OpenCode Zen has a deprecation date of May
14](https://opencode.ai/docs/zen/#deprecated-models) but that seems to
still be active and is [not marked as deprecated on
models.dev](8e710e19ea/providers/opencode/models/glm-5.toml)
so I didn't remove it yet 🤷

----

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [ ] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Closes https://github.com/zed-industries/zed/issues/56869

Release Notes:
- OpenCode: updated models (removed Ring 2.6 1T Free, enabled
interleaved reasoning for Nemotron 3 Super Free, deleted incorrect image
support for GPT 5.3 Codex Spark, and updated token counts for MiniMax
M2.5, GLM 5, and GLM 5.1)
- OpenCode Free: clearer docs for temporary free models
2026-05-18 17:15:04 +00:00
Bennet Bo Fenner
68340172a1
agent: Remove unused LanguageModelImage APIs (#57050)
Pulled out from #56866. Will help with MCP image support

Release Notes:

- N/A
2026-05-18 12:22:22 +00:00
Bennet Bo Fenner
1dc07b40b9
copilot: Fix issue when switching between OpenAI and Anthropic models (#56655)
We were storing reasoning output inside `RedactedThinking` which causes
issues when switching mid-turn from an OpenAI to an Anthropic model.
This implementation fixes this by storing it inside `reasoning_details`,
which matches our responses implementation in `open_ai.rs`

See
https://github.com/microsoft/vscode-copilot-chat/blob/main/src/platform/endpoint/node/responsesApi.ts

For whatever reason the copilot chat extension sets `summary: []`, this
is what our implementation does too

Closes #56385

Release Notes:

- Fixed an issue where the agent would error when using Copilot as a
provider and switching between OpenAI and Anthropic models
2026-05-18 09:57:08 +00:00
Danilo Leal
fa838566c6
agent_ui: Adjust settings UI for some providers (#56879)
Namely, makes the ChatGPT subscription settings item a bit tidier, as
well as more consistent with the Copilot one.

Release Notes:

- N/A
2026-05-15 17:08:40 +00:00
Bennet Bo Fenner
3a742b5e0d
language_models: Remove unused cache_configuration API (#56884)
Release Notes:

- N/A
2026-05-15 16:27:11 +00:00
Jiby Jose
4e06b33eb9
Fix title generation when using openai models via subscription (#56858)
Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Closes: #56857 Title generation failing on agent threads with OpenAI
models from subscription

Currently title generation requests are made without any system prompt
which causes instructions to be not set, and the request fails. This
change makes sure there is always instructions set (api does not seem to
care if its empty or not)

Release Notes:

- N/A
2026-05-15 12:57:37 +00:00
Jiby Jose
e51b78db15
language_models: Support specifying reasoning effort with OpenAI subscription (#56849)
Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Closes: #56844 

Release Notes:

- Added support for specifying effort level when using OpenAI models via
ChatGPT subscription
2026-05-15 12:41:40 +00:00
morgankrey
37f6d7a15c
Add ChatGPT subscription provider via OAuth 2.0 PKCE (#53166)
Adds a new language model provider that lets users authenticate with
their ChatGPT Plus/Pro subscription and use OpenAI models
(codex-mini-latest, o4-mini, o3) directly in the Zed agent — without
needing a separate API key.

## How it works

1. **OAuth 2.0 + PKCE sign-in**: Uses OpenAI's official Codex CLI client
ID to run an authorization code flow. A local HTTP server on
`127.0.0.1:1455` captures the callback, exchanges the code for tokens,
and stores them in the system keychain.

2. **Token refresh**: Access tokens are automatically refreshed when
they're within 5 minutes of expiry, using the stored refresh token.

3. **Responses API**: Requests go to
`https://chatgpt.com/backend-api/codex/responses` using the existing
`open_ai::responses` client (Responses API format, not Chat Completions
which was deprecated for this endpoint in Feb 2026).

4. **Required headers**: `originator: zed`, `OpenAI-Beta:
responses=experimental`, `ChatGPT-Account-Id` (extracted from JWT),
`store: false` in the body.

## Files changed

- `crates/open_ai/src/responses.rs`: Add `store: Option<bool>` field to
`Request`; add `extra_headers` param to `stream_response` for
per-provider header injection
- `crates/language_models/src/provider/openai_subscribed.rs`: New
provider (sign-in UI, OAuth flow, token storage/refresh, model list)
- `crates/language_models/src/provider/open_ai.rs`,
`open_ai_compatible.rs`, `opencode.rs`: Pass `vec![]` for new
`extra_headers` param
- `crates/language_models/src/language_models.rs`: Register the new
provider
- `crates/language_models/Cargo.toml`: Add `rand` and `sha2` deps for
PKCE

## Open questions / known gaps

- [ ] **Terms of service**: Usage appears to be within OpenAI's ToS
(interactive use via their official CLI client ID), but needs legal
sign-off before shipping
- [ ] **Redirect URI**: Currently `http://localhost:1455/auth/callback`
— may need to match exactly what OpenAI's Codex CLI uses
- [ ] **UI polish**: The sign-in card is functional but minimal; needs
design review
- [ ] **Error messages**: OAuth error responses from the callback URL
aren't surfaced to the user yet
- [ ] **`o3` availability**: o3 may require a higher subscription tier;
consider gating it

## Testing

Sign-in flow was designed to match the Copilot Chat provider pattern.
Manual testing against the live OAuth endpoint is needed.

Release Notes:

- Added ChatGPT subscription provider, allowing users to use their
ChatGPT Plus/Pro subscription with the Zed agent

---------

Co-authored-by: Zed Zippy <234243425+zed-zippy[bot]@users.noreply.github.com>
Co-authored-by: Richard Feldman <richard@zed.dev>
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
Co-authored-by: Agus Zubiaga <agus@zed.dev>
2026-05-14 21:03:56 +00:00
Bennet Bo Fenner
956dbea79a
copilot: Fix cache control error (#56632)
#56472 broke Copilot chat

> Failed to connect to API: 400 Bad Request {"message":"cache_control:
Extra inputs are not permitted"}

This PR makes it so that we still use the legacy caching approach for
Copilot

Release Notes:

- N/A
2026-05-13 13:34:47 +00:00
Richard Feldman
800a795545
bedrock: Add system-prompt cache anchor on caching-capable models (#56474)
The Bedrock Converse API supports placing `CachePoint` blocks inside the
`system` field, but we were sending the system prompt as a single
`SystemContentBlock::Text`, which leaves the system tokens dependent on
whatever message-level breakpoint happens to fall within the 20-block
lookback window.

This widens `bedrock::Request.system` from `Option<String>` to
`Vec<BedrockSystemContentBlock>` and has `into_bedrock` emit
`[Text(system), CachePoint(Default)]` whenever the model supports prompt
caching. The system prompt now anchors its own cache prefix, on top of
the existing tool-list anchor and per-message breakpoint, so a stable
system prompt keeps producing cache hits even when earlier conversation
turns change.

Bedrock does not support automatic caching or the 1-hour TTL, so the
default 5-minute ephemeral cache is the only option for this provider.

Release Notes:

- Improved Bedrock prompt cache utilization by anchoring the system
prompt as its own cache prefix
2026-05-12 22:46:29 +00:00
Conrad Irwin
54188321be
Fix token refresh for HTTP requests (#56559)
Code had been assuming (erroneously, but understandably) that
LlmApiToken::acquire would give them a valid token.

This is not true, as those tokens expire and you must call refresh
explicitly.

Add some helpers to do the retry for you, and rename acquire to cached
to be
clearer about the intent.

Closes #ISSUE

Release Notes:

- Fixed some rare cases where API requests would fail with Unauthorized
2026-05-12 19:40:00 +00:00
Ben Brandt
78c889c21d
open_ai: Responses API improvements (#56476)
Release Notes:

- Removed deprecated OpenAI models
- Added support for gpt-5.4-nano/mini models for OpenAI provider
- Improved output quality when using OpenAI models

---------

Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
Co-authored-by: Smit Barmase <heysmitbarmase@gmail.com>
Co-authored-by: Gaauwe Rombouts <mail@grombouts.nl>
2026-05-12 14:47:16 +00:00
Xiaobo Liu
166c53206e
deepseek: Add tool_choice field to DeepSeek request model (#56040)
Release Notes:

- N/A

reference: https://api-docs.deepseek.com/api/create-chat-completion/
2026-05-12 08:21:38 +00:00
Gunner Kwon
5d3f275cbf
bedrock: Add Guardrail configuration support (#50084)
## Background: Amazon Bedrock Guardrails

AWS Bedrock Guardrails enables configurable safety and compliance
controls for generative AI applications:
- Evaluates both user inputs and model responses against policies.
:contentReference[oaicite:8]{index=8}
- Can block or filter content based on harmful categories, denied
topics, PII, or hallucination criteria.
:contentReference[oaicite:9]{index=9}
- Guardrails are applied during inference API calls by specifying
`guardrailIdentifier` and `guardrailVersion` in the request.
:contentReference[oaicite:10]{index=10}

Relevant AWS documentation:
- User Guide:
https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html
- How Guardrails Works:
https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-how.html
- API Reference (GuardrailConfiguration):
https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GuardrailConfiguration.html


Some AWS environments enforce IAM policies that require a guardrail to
be specified on every Bedrock API call (via a `StringEquals` condition
on `bedrock:GuardrailIdentifier`). Without this, Zed returns
`AccessDenied` and Bedrock models are completely unusable in those
environments.

This adds two optional settings, `guardrail_identifier` and
`guardrail_version`, to the Bedrock provider config. When set, a
`GuardrailStreamConfiguration` is attached to every `converse_stream`
request. When unset, behaviour is identical to before.

```json
{
  "language_models": {
    "bedrock": {
      "guardrail_identifier": "arn:aws:bedrock:us-east-1:123456789012:guardrail/abc123",
      "guardrail_version": "DRAFT"
    }
  }
}
```

`guardrail_version` defaults to `"DRAFT"` if omitted.

Release Notes:

- agent: Added `guardrail_identifier` and `guardrail_version` settings
for AWS Bedrock, enabling use in environments where IAM policies require
a guardrail on all model requests

---------

Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
2026-05-11 17:43:54 +00:00
Bennet Bo Fenner
aeb05899b3
open_ai: Support specifying reasoning effort (#56411)
Closes #54875

Release Notes:

- Added support for specifying effort level when using OpenAI models
2026-05-11 13:48:33 +00:00
Bennet Bo Fenner
ee309a0000
anthropic: Dynamically fetch models from /models (#56397)
Most compelling reason to make this change is that we don't have to ship
a new Zed binary if Anthropic releases a new model

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Release Notes:

- anthropic: Dynamically fetch available models from Anthropic API

---------

Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
2026-05-11 13:19:20 +00:00
Bennet Bo Fenner
b3c65f9410
bedrock: Always use 1M context window for anthropic models (#56195)
Closes #49617

Release Notes:

- bedrock: Always use 1M context windows for Anthropic models
2026-05-08 15:43:52 +00:00
Teslim Olunlade
954ac0f3ca
Add conditional check for auto_discover in Ollama provider (#55999)
Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Closes #55998

Release Notes:

- ollama: Fixed issue where specifying `auto_discover: false` would
still auto discover models
2026-05-08 14:17:59 +00:00
marius851000
65107c90b1
ollama: Fix thinking being also sent as content (#55540)
"msg.string_contents();" return more than just content. It also return
tool result (which need special handling, and should be emitted from the
tool role) and thinking (which need already implemented special
handling).

This resulted in thinking being sent as content, which apparently
ollama’s gemma4 parser didn’t handled well, and caused a number of issue
that manifested by 1. outputting end of though token where unappropriate
and 2. repeating things endlessly

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [ ] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
(no UI/UX impact)
- [ ] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable (a quick
review, without benchmark, show that this should be at least as fast as
the previous code. This is called only exceptionally anyway.)

Closes #55537

Release Notes:

- Fixed "thinking" text being badly formatted when sent to Ollama
2026-05-06 13:03:00 +00:00
Gabriel Linder
0dd5427e02
Cleanup and update mistral models based on their documentation (#55443)
Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [ ] Unsafe blocks (if any) have justifying comments
- [ ] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [ ] Performance impact has been considered and is acceptable

Release Notes:

- Mistral: Added Ministral 3 models and removed deprecated models

---------

Signed-off-by: Gabriel Linder <linder.gabriel@gmail.com>
2026-05-06 12:27:58 +00:00
Conrad Irwin
be705e677b
Merge gpui::Task and scheduler::Task (#53674)
Release Notes:

- N/A or Added/Fixed/Improved ...
2026-05-05 22:41:13 +00:00
Vlad Ionescu
a42374250f
opencode: Support interleaved_reasoning and fix DeepSeek (#55574)
OpenCode API endpoints for DeepSeek were [moved from
Anthropic-compatible to
OpenAI-compatible](https://github.com/anomalyco/opencode/pull/24500) and
DeepSeek requires interleaved reasoning enabled to work. I ran a
_"rename this variable to potato"_ test and I can confirm DeepSeek V4
Flash and Pro both work now 🎉

Some other OpenCode Go models were marked [on
models.dev](https://github.com/anomalyco/models.dev/tree/dev/providers/opencode-go/models)
as supporting `interleaved_reasoning` so they too got that enabled. Kimi
K2.5 and Kimi K2.6 continue to fail with
https://github.com/zed-industries/zed/issues/51743
(https://github.com/zed-industries/zed/pull/55085 seems to hint at this
being [an OpenCode
issue](https://github.com/zed-industries/zed/issues/51743#issuecomment-4336785765)?),
but all other models seem to work fine both with `interleaved_reasoning`
and without it 🤷 I assume it's better to have that turned on? Again, the
intersection of OpenAI Chat Completions API, different models, different
inference providers, how they all work together is something I know
nothing about!

Self-Review Checklist:

- [X] I've reviewed my own diff for quality, security, and reliability
- [X] Unsafe blocks (if any) have justifying comments
- [X] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [ ] Tests cover the new/changed behavior
- [X] Performance impact has been considered and is acceptable

Release Notes:
- OpenCode Go: use correct DeepSeek endpoints
- OpenCode: add support for interleaved_reasoning
2026-05-05 14:30:17 +00:00
Bennet Bo Fenner
038e2136fb
cloud: Fix incorrect model getting selected at startup (#55325)
Follow up to #54826, after which the fallback model would be selected
instead of the cloud model when starting Zed

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Release Notes:

- N/A
2026-05-04 07:52:37 +00:00
Bennet Bo Fenner
2985e058c3
Remove v0 provider (#55177)
Removes the Vercel v0 Provider, as the v0 API has been
depredated/removed (https://api.v0.dev/v1)

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [ ] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Release Notes:

- agent: Removed Vercel v0 provider as it has been deprecated by Vercel
2026-04-29 10:06:41 +00:00
Vlad Ionescu
5eec0cecc1
opencode: Model updates + thinking levels (#54880)
Leftovers after https://github.com/zed-industries/zed/pull/53651

Updated models:
- **Zen**: GPT 5.5 and GPT 5.5 Pro - not tested because I don't have a
Zen subscription and I stubbornly refuse to get one
- **Go**: DeepSeek V4 Pro and DeepSeek V4 Flash - failing due to
https://github.com/anomalyco/opencode/issues/24224
- **Go**: MiMo V2.5 and MiMo V2.5 Pro - tested, confirmed working
- **Free**: Ling 2.6 Flash, [available for "a limited
time"](https://x.com/opencode/status/2046717718028513694) - tested,
confirmed working
- **Free**: Hy3 Preview, [available until May
8](https://x.com/opencode/status/2047328981435756824) - tested,
confirmed working

When testing the new models and comparing with OpenCode CLI, I realized
the
[variants](https://opencode.ai/docs/models/#built-in-variants)/thinking
effort configuration was not supported by the Zed Agent implementation.
I added that for OpenCode Go models, after manually checking what each
model supports in OpenCode. Reasoning levels and everything seems to
work (UI looks good, model works) but I have no idea how to specifically
test reasoning levels without doing a full benchmark 🤷
I did not add the same thing for OpenCode Zen models because I could not
figure out a way to get the supported variants.
@benbrandt let me know if you want me to take this out of this PR and to
keep just the model updates!

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [ ] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Release Notes:
- OpenCode: add new models (GPT 5.5, DeepSeek V4, MiMo V2.5, Ling 2.6,
Hy3)
- OpenCode Go: add support for configurable reasoning effort levels

---------

Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
2026-04-28 16:00:27 +00:00
Xiaobo Liu
4c27cee963
deepseek: Add deepseek-v4-pro & deepseek-v4-flash (#54731)
reference: https://api-docs.deepseek.com/

Release Notes:

- Added deepseek-v4-pro and deepseek-v4-flash models

---------

Signed-off-by: Xiaobo Liu <cppcoffee@gmail.com>
Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
Co-authored-by: MrSubidubi <dev@bahn.sh>
2026-04-27 13:50:00 +00:00
Bennet Bo Fenner
bf3fc2336d
agent: Allow tools to output multiple content parts (#54518)
Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Closes #ISSUE

Release Notes:

- N/A
2026-04-27 12:36:11 +00:00
Bennet Bo Fenner
83cc2ec054
open_ai: Use responses API for all models (#54910)
From the
[docs](https://developers.openai.com/api/docs/guides/migrate-to-responses#responses-benefits):

> Better performance: Using reasoning models, like GPT-5, with Responses
will result in better model intelligence when compared to Chat
Completions. Our internal evals reveal a 3% improvement in SWE-bench
with same prompt and setup.
Agentic by default: The Responses API is an agentic loop, allowing the
model to call multiple tools, like web_search, image_generation,
file_search, code_interpreter, remote MCP servers, as well as your own
custom functions, within the span of one API request.
Lower costs: Results in lower costs due to improved cache utilization
(40% to 80% improvement when compared to Chat Completions in internal
tests).
Stateful context: Use store: true to maintain state from turn to turn,
preserving reasoning and tool context from turn-to-turn.
Flexible inputs: Pass a string with input or a list of messages; use
instructions for system-level guidance.
Encrypted reasoning: Opt-out of statefulness while still benefiting from
advanced reasoning.
Future-proof: Future-proofed for upcoming models.

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [ ] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Closes #ISSUE

Release Notes:

- Always use Responses API for OpenAI models
2026-04-27 09:52:51 +00:00
Marshall Bowers
2cc960c991
language_models: Fix is_authenticated state for Cloud provider (#54826)
This PR fixes an issue introduced in
https://github.com/zed-industries/zed/pull/54397 where the Zed Cloud
provider would not be reflected as "authenticated" if a connection to
Collab was attempted, but could not be established.

This was especially noticable when running Zed against a local version
of Cloud and not having Collab running.

This restores the original logic prior to that change.

Release Notes:

- N/A
2026-04-27 08:59:36 +00:00
Bennet Bo Fenner
db1d79a5ca
open_ai: Add support for gpt-5.5 (#54820)
Release Notes:

- Added support for GPT 5.5 and GPT 5.5 Pro via the OpenAI provider
2026-04-24 20:26:56 +00:00
Vlad Ionescu
f59c122bee
opencode: Add support for OpenCode Go (#53651)
**TL;DR**: add support for OpenCode Go (flat-rate monthly subscription)
along the already-implemented OpenCode Zed (pay-as-you-go billing).

> [!WARNING]
> This code was written by LLMs, under the supervision of a so-called
developer that never wrote Rust profesionally and that spends more time
in Pages&Keynote than in an IDE.

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [ ] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

## Background

OpenCode offers a few different ways to access models:
- **free access to 3 models**, with feedback and data used to improve
the model. These models use the OpenCode Zen API endpoints, but have
different usage limits (200 requests per 5 hours) and have a different
privacy policy. Some people disable or block the free models, some
people are super-excited to have access to LLMs for free, and some
people like using the free models to test new LLMs (at launch MiMo-V2
had 2 free weeks of usage, for example).
- **pay-as-you-go access to 30 models** as part of the [OpenCode
Zen](https://opencode.ai/zen) subscription. These models use the same
OpenCode Zen API endpoints.
- **flat-rate monthly access to 7 models** as part of the [OpenCode
Go](https://opencode.ai/go) subscription. These models use the OpenCode
Zen API endpoints with an extra `/go` appended to the path. There are
5-hour, weekly, and monthly usage limits and, additionally, users can
toggle a switch in the OpenCode Console to use Zen models with their
pay-as-you-go billing after the Go limits are hit.

There's also a currently-paused [OpenCode
Black](https://opencode.ai/black) flat-rate subscription with way higher
usage limits and with access to more models, with $100 and $200 monthly
plans.

The whole thing is a bit messy, but it's great value and highly reliable
LLM access!

<br>

https://github.com/zed-industries/zed/pull/49589 added support for
OpenCode Zen by implementing a new `opencode` provider. OpenCode Go
[could be used by overriding the API
URL](https://github.com/zed-industries/zed/pull/49589#issuecomment-4130300454),
but that is a terrible user experience: some models have to be manually
added, the model list always shows the 30-something OpenCode Zen models,
and free models cannot be used at all.

I was annoyed by the experience of using OpenCode Go with Zed and this
past week I had to test a bunch of LLMs and providers and harnesses, so
I took this on as a test case 🙂

## Implementation

This PR makes the OpenCode provider more general (not just for Zen) and
adds an `OpenCodeModelSubscription` concept which is then used to
implement support for OpenCode Go. The free models are also broken out
into their own subscription for a prettier model list.

For a better user experience, the different subscriptions can be enabled
or disabled, both in the settings file and in the UX:
<img width="434" height="176" alt="Screenshot showing the OpenCode
provider configuration, with the newly added toggles"
src="https://github.com/user-attachments/assets/3520e114-00c8-4794-84bf-35cd72d9c57e"
/>


The code was written by LLMs, but I do understand it and I did a bunch
of "manual" iterations and "manual" tweaks. Still, my Rust experience is
non-existent so **I won't feel offended if y'all reject this PR**! I did
consider alternatives (adding a new `opencode-go` provider and renaming
this to `opencode-zen`, for example, or adding support for custom API
URLs in OpenCode custom models which would've been the smallest code
change but a terrible user experience, and so on) but all alternatives
would have been, in my opinion, a worse user experience.

**Tests I did**:
- confirmed OpenCode Go models work as expected
- confirmed OpenCode Zen Free models work as expected
- confirmed I get an error when trying to use OpenCode Zen models since
I don't have that subscription
- confirmed the subcription toggles work as expected (model are
shown/hidden, settings file is updated)

**Notes**:
- this PR is best reviewed commit-by-commit. I did not create a separate
PR for the model updates to minimize delays
- my exeprience with Rust is roughly zero, but I tried to strike a
balance between idiomatic Rust and easy-to-read code
- users of the OpenCode provider might have to do some re-configuration
after this PR is merged since the model identifiers now include the
subscription, eg `claude-haiku-4-5` is now `zen/claude-haiku-4-5`. Since
this is a relatively new provider and the impact is small, I preffered
that rather than adding complex migration/mapping logic.
- does changing the provider name from "OpenCode Zen" to "OpenCode"
break anything for y'all at Zed?
- does changing the telemetry id from `"opencode/<model-id>"` to
`"opencode/<subscription>/<model-id>"` break anything for y'all at Zed?

---

Release Notes:
- OpenCode provider: add support for OpenCode Go

---------

Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
2026-04-24 14:17:30 +00:00
Matt Van Horn
adab7b8871
language_models: Honor images capability for custom OpenAI models (#54223)
## Summary

Users who add custom OpenAI models under
`language_models.openai.available_models` can set `capabilities.images:
true` to declare that the endpoint accepts image inputs. Today, that
setting is silently ignored: the Agent panel's image-attach button stays
disabled regardless, and the only workaround is to switch to a built-in
OpenAI model, attach the image, and switch back.

Root cause: `Model::Custom` does not carry a `supports_images` field,
and the OpenAI provider's `supports_images()` for the `Custom` arm
hardcodes `false`.

## Changes

1. `crates/settings_content/src/language_model.rs`: add `images: bool`
to `OpenAiModelCapabilities` with `#[serde(default)]` so existing
settings.json files keep working unchanged.
2. `crates/open_ai/src/open_ai.rs`: add `supports_images: bool` to
`Model::Custom` with a matching serde default.
3. `crates/language_models/src/provider/open_ai.rs`: pass
`model.capabilities.images` into the `Model::Custom` variant in
`provided_models`, and return the stored value from `supports_images()`
for `Custom`.

Existing `Model::Custom { .. }` match sites (`completion.rs:829`,
various in `open_ai.rs`) all use `..` so they continue to compile
without change.

## Testing

- `cargo check -p settings_content -p open_ai -p language_models`:
clean.
- I was not able to complete `./script/clippy` locally: the build
stalled on the first-time `webrtc-sys` download for livekit-rust-sdks
(TLS close_notify failure on docs.rs mirror). Happy to rerun once CI has
cached artifacts.
- Manually verified the capability plumbing by tracing: settings.json ->
`OpenAiModelCapabilities.images` -> `Model::Custom { supports_images }`
-> `supports_images()` -> `Thread::prompt_capabilities` ->
`SessionCapabilities.supports_images()` -> `build_add_context_menu` gate
in `thread_view.rs`.

## Related Issues

Closes #50752

Release Notes:

- Fixed custom OpenAI models ignoring the `capabilities.images` setting
in `language_models.openai.available_models`.

This contribution was developed with AI assistance (Codex).

---------

Co-authored-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
2026-04-24 10:36:46 +00:00
Lukas Wirth
c5a2807492
Remove smol as a dependency from a bunch of crates (#53603)
We aren't making use of it in these crates and it unblocks some
web-related work

Release Notes:

- N/A or Added/Fixed/Improved ...
2026-04-24 10:29:51 +00:00
Finn Evers
9b40411c6a
Fix bad GitHub merge queue merge (#54721)
No, sadly, the title is not a typo. See
https://www.githubstatus.com/incidents/zsg1lk7w13cf for the context.
I'll read with joy and popcorn through that root cause analysis.

It makes literally zero sense what happened here, but for some completly
bonkers reason GitHub completely messed up the merge queue with
https://github.com/zed-industries/zed/pull/54632.

I have no idea how it happened. It makes literally zero sense. A PR
going into the merge queue should have the same LoC when getting out of
it. GitHub obviously does not check this. GitHub causes extra work with
a feature that is supposed to save time.

Thanks, I guess.

Release Notes:

- N/A

---------

Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
2026-04-23 23:47:30 +00:00
Danilo Leal
0ab64d6414
branch_picker: Add button to filter remote branches (#54632)
This PR brings back the button to filter remote branches when accessing
the title bar's branch picker with the mouse. It was unintentionally
removed when we introduced the new worktree picker.

Release Notes:

- N/A
2026-04-23 18:26:44 +00:00
Eric Holk
4a5fbf6a3f
Request summarized thinking for Claude Opus 4.7 (#54217)
Starting with Claude Opus 4.7, Anthropic omits thinking content from
responses by default; callers must pass `display: "summarized"` to keep
seeing thinking summaries. Without opting in, the agent UI shows a long
pause with no visible thinking, and users get no progress indication
during extended reasoning.

This extends the adaptive-thinking wire type with an optional `display`
field and requests `Summarized` from every call site that builds an
adaptive thinking request (direct Anthropic, Copilot Chat proxy, Zed
Cloud, and Bedrock).

## Notes

- Applied at the adaptive-thinking layer rather than special-casing Opus
4.7. The `display` parameter is accepted by every
adaptive-thinking-capable model, and the previous behavior (visible
summaries) is what users already see on Opus 4.6 / Sonnet 4.6, so there
is no behavior change for those models.

Release Notes:

- Restored thinking summaries for Claude Opus 4.7.
2026-04-23 15:43:02 +00:00
Ben Brandt
2eafa6e6aa
language_models: Remove unused language model token counting (#54177)
Drop the `count_tokens` API and related implementations across
providers, and remove the unused `tiktoken-rs` dependency.

I was going to update the dependency becuase they finally released a fix
we needed. But then I realized we only used this api in one place, the
Rules library. And for most models it would have been wildly incorrect
becuase we use tiktoken, i.e. OpenAI tokenizers, for almost every model,
which is going to give incorrect results.

Given that, I just removed these because the difference in how we get
these has caused plenty of confusion in the past.

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Release Notes:

- N/A
2026-04-22 13:39:48 +00:00
Guilherme do Amaral Alves
7b082cbb6f
Add interleaved_reasoning option to openai compatible models (#54016)
Release Notes:

- Added interleaved_reasoning option to openai compatible models

---

This PR adds the interleaved_reasoning option for OpenAI-compatible
models, addressing the issue described in
https://github.com/ggml-org/llama.cpp/issues/20837.

In my testing, enabling interleaved_reasoning not only resolved the
tool-calling issues encountered by Qwen3.5 models in llama.cpp, but also
appeared to improve the model's coding capabilities. I have also
verified the outgoing requests using a proxy to ensure the parameter is
being sent correctly.It is also likely that this change will benefit
other models and providers as well.

Note: While I used AI to assist with the implementation, I have reviewed
and tested the changes. As I am relatively new to Rust and the Zed
codebase, I would appreciate any feedback or suggestions for
improvement. I am happy to make further adjustments if needed.

Thank you all for building such an amazing editor!

Co-authored-by: Oleksiy Syvokon <oleksiy@zed.dev>
2026-04-22 10:40:37 +00:00
Smit Barmase
51fc26da22
Fix agent default model not picking up after authentication resolve (#54397)
Regression in https://github.com/zed-industries/zed/pull/54125

Release Notes:

- agent: Fixed an issue where the default Zed model would not get
selected after sign-in completed

---------

Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
2026-04-21 10:49:47 +00:00
Tom Houlé
57e01b3701
language_models: Fix misleading copy when hosted models are disabled (#53971)
The agent settings UI unconditionally showed "You have access to Zed's
hosted models through your Organization" for all Business plan users,
even when the org admin had turned off the Zed AI model provider. Now
the copy reads "Zed's hosted models are disabled by your organization's
configuration" when is_zed_model_provider_enabled is false.

Also added component preview entries for both Business plan states.

Closes CLO-667

Release Notes:

- N/A

---------

Co-authored-by: Neel <neel@zed.dev>
2026-04-20 19:57:45 +02:00
Abhishek Tripathi
1115f768c9
copilot: Wire up reasoning tokens for GPT models (#53313)
Fix two issues with reasoning support in the Copilot provider:

- Responses API path: use the user's thinking_effort setting instead of
hardcoding Medium effort
- Chat Completions path: compute and pass thinking_budget when thinking
is enabled, instead of unconditionally setting it to None

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [ ] Unsafe blocks (if any) have justifying comments
- [ ] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [ ] Tests cover the new/changed behavior
- [ ] Performance impact has been considered and is acceptable

Closes #52140

Release Notes:

- Fixed a bug where copilot wouldn't use the thinking level the user's
have set

---------

Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
Co-authored-by: Bennet Bo Fenner <bennet@zed.dev>
2026-04-20 14:33:18 +02:00
Richard Feldman
76aab0c35c
Fix RefCell panic in cloud model token counting (#54188)
Fixes #54140

When `RulesLibrary::count_tokens` calls
`CloudLanguageModel::count_tokens` for Google cloud models, it does so
inside a `cx.update` closure, which holds a mutable borrow on the global
`AppCell`. The Google provider branch then called
`token_provider.auth_context(&cx.to_async())`, which created a new
`AsyncApp` handle and tried to take a shared borrow on the same
`RefCell` — causing a "RefCell already mutably borrowed" panic.

This only affects Google models because they are the only provider that
counts tokens server-side via an HTTP request (requiring
authentication). The other providers (Anthropic, OpenAI, xAI) count
tokens locally using tiktoken, so they never call `auth_context` during
`count_tokens`.

The fix makes `CloudLlmTokenProvider::auth_context` generic over `impl
AppContext` instead of requiring `&AsyncApp`. This allows the
`count_tokens` call site to pass `&App` directly (which reads entities
without re-borrowing the `RefCell`), while all other call sites that
already pass `&AsyncApp` (e.g. `stream_completion`, `refresh_models`)
continue to work unchanged.

Release Notes:

- Fixed a crash ("RefCell already mutably borrowed") that could occur
when counting tokens with Google cloud language models.
2026-04-17 11:03:46 -04:00
Richard Feldman
87b47a4b08
Add Claude Opus 4.7 BYOK (#54077)
<img width="767" height="428" alt="Screenshot 2026-04-16 at 11 29 13 AM"
src="https://github.com/user-attachments/assets/e8b450fa-aefc-4dec-a286-b211bd492011"
/>

Add Claude Opus 4.7 (`claude-opus-4-7`) to the anthropic, bedrock, and
opencode provider crates.

Key specs:
- 1M token context window
- 128k max output tokens
- Adaptive thinking support
- AWS Bedrock cross-region inference (global, US, EU, AU)

Release Notes:

- Added Claude Opus 4.7 as an available language model
2026-04-17 10:51:32 -04:00
Anthony Eid
e92a40a9d8
agent: Auto-select user model when there's no default (#54125)
Reimplements #36722 while fixing the race that required the revert in
#36932.

When no default model is configured, this picks an environment fallback
by authenticating all providers. It always prefers the Zed cloud
provider when it's authenticated, and waits for its models to load
before picking another provider as the fallback, so we don't flicker
from Zed models to Anthropic while sign-in is in flight.

The fallback is recomputed whenever provider state changes (via
`ProviderStateChanged`/`AddedProvider`/`RemovedProvider` events), so the
selection becomes correct as soon as cloud models arrive.

### What changed vs. the original PR

- `language_models::init` now owns `authenticate_all_providers`
(previously done in `LanguageModelPickerDelegate` and `agent`'s
`LanguageModels`).
- After all authentications settle, and on any subsequent provider state
change, `update_environment_fallback_model` recomputes the fallback.
- The fallback logic prefers Zed cloud: if the cloud provider is
authenticated, only use it (waiting for its models to load). Otherwise,
fall through to the first authenticated provider with a default or
recommended model.
- `LanguageModelRegistry::default_model()` falls back to
`environment_fallback_model` when no explicit default is set.
- Existing `Thread`s that are empty are updated to the new default when
`DefaultModelChanged` fires, so a blank thread started before sign-in
switches to Zed models once the user signs in.

Release Notes:

- agent: Automatically select a model when there's no selected model or configured default
2026-04-16 18:35:47 -04:00
Ben Brandt
223e4ae0e9
copilot_chat: Set Copilot output config only when effort exists (#54103)
It seems their verification got stricter, at least stricter than
Anthropics. Only set the output config if we have an effort.

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [ ] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Closes #54036

Release Notes:

- copilot_chat: Fix invalid reasoning effort for some models.

Co-authored-by: John Tur <john-tur@outlook.com>
2026-04-16 17:47:05 +00:00
Smit Barmase
bbf087756c
language_models: Fix Mistral tool use erroring out (#54058)
Closes #52717

Mistral's streaming API sometimes sends `"id": "null"` (the literal
string, not JSON null) in continuation chunks for tool calls. Our stream
handler treated this as a valid ID, overwriting the real tool call ID
from the first chunk. When the corrupted ID was sent back in the next
request, Mistral's API rejected it with "Tool call id has to be defined
in serving mode."

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Release Notes:

- Fixed issue where Mistral models erroring out with "Tool call id has
to be defined" when using tools.
2026-04-16 18:21:02 +05:30
Agus Zubiaga
98c17ca160
language_models: Refactor deps and extract cloud (#53270)
- `language_model` no longer depends on provider-specific crates such as
`anthropic` and `open_ai` (inverted dependency)
- `language_model_core` was extracted from `language_model` which
contains the types for the provider-specific crates to convert to/from.
- `gpui::SharedString` has been extracted into its own crate (still
exposed by `gpui`), so `language_model_core` and provider API crates
don't have to depend on `gpui`.
- Removes some unnecessary `&'static str` | `SharedString` -> `String`
-> `SharedString` conversions across the codebase.
- Extracts the core logic of the cloud `LanguageModelProvider` into its
own crate with simpler dependencies.


Release Notes:

- N/A

---------

Co-authored-by: John Tur <john-tur@outlook.com>
2026-04-07 12:28:19 -03:00
Vimsucks
49ebe4bd61
Add reasoning_effort field to OpenAI compatible model configuration (#50582)
Some model like glm-5、kimi-k2.5 support reasoning, but require
reasoning_effort parameter

This pr add support for setting reasoing_effort for openai compatible
models

Tested using the following config:

```json
{
  "language_models": {
    "openai_compatible": {
      "My LiteLLM": {
        "available_models": [
          {
            "name": "glm-5",
            "display_name": "glm-5",
            "max_tokens": 73728,
            "reasoning_effort": "low"
          },
          {
            "name": "kimi-k2.5",
            "display_name": "kimi-k2.5",
            "max_tokens": 262144,
            "reasoning_effort": "low"
          }
        ]
      }
    }
  }
}

```

Release Notes:

- Added a setting to control `reasoning_effort` in custom
OpenAI-compatible models
2026-04-04 19:22:04 +00:00
Jakub Konka
29609d3599
language_model: Decouple from Zed-specific implementation details (#52913)
This PR decouples `language_model`'s dependence on Zed-specific
implementation details. In particular
* `credentials_provider` is split into a generic `credentials_provider`
crate that provides a trait, and `zed_credentials_provider` that
implements the said trait for Zed-specific providers and has functions
that can populate a global state with them
* `zed_env_vars` is split into a generic `env_var` crate that provides
generic tooling for managing env vars, and `zed_env_vars` that contains
Zed-specific statics
* `client` is now dependent on `language_model` and not vice versa

Release Notes:

- N/A
2026-04-02 17:06:57 -03:00
Jakub Konka
6663a60876
language_model: Refactor crate structure and dependencies (#52857)
A couple of things that this PR wants to accomplish:
* remove dependency on `settings` crate from `language_model`
* refactor provider-specific code into submodules - to be honest, I
would go one step further and put all provider-specific bits in
`language_models` instead but I realise we have cloud logic in
`language_model` which uses those too making it tricky
* move anthropic-specific telemetry into `language_models` crate - I
think it makes more sense for it to be there

Anyhow, I would very appreciate if you could have a look @mikayla-maki
and @maxdeviant and lemme know what you think, if you would tweak
something, etc.

Release Notes:

- N/A
2026-04-01 17:44:25 +02:00