Commit graph

13 commits

Author SHA1 Message Date
Bennet Bo Fenner
3a742b5e0d
language_models: Remove unused cache_configuration API (#56884)
Release Notes:

- N/A
2026-05-15 16:27:11 +00:00
Bennet Bo Fenner
956dbea79a
copilot: Fix cache control error (#56632)
#56472 broke Copilot chat

> Failed to connect to API: 400 Bad Request {"message":"cache_control:
Extra inputs are not permitted"}

This PR makes it so that we still use the legacy caching approach for
Copilot

Release Notes:

- N/A
2026-05-13 13:34:47 +00:00
Conrad Irwin
54188321be
Fix token refresh for HTTP requests (#56559)
Code had been assuming (erroneously, but understandably) that
LlmApiToken::acquire would give them a valid token.

This is not true, as those tokens expire and you must call refresh
explicitly.

Add some helpers to do the retry for you, and rename acquire to cached to
be clearer about the intent.
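
A minimal sketch of the cached-then-refresh pattern described above, not the code from this PR. `TokenSource`, `RequestError`, and `with_token_retry` are hypothetical names, the HTTP request is replaced by a closure, and the real helpers are presumably async:

```rust
use std::cell::{Cell, RefCell};

/// Hypothetical error type for a request made with a bearer token.
#[derive(Debug, PartialEq)]
enum RequestError {
    Unauthorized,
    Other(String),
}

/// Hypothetical token holder: `cached` may hand back an expired token,
/// while `refresh` always obtains a new one.
struct TokenSource {
    current: RefCell<String>,
    refreshes: Cell<u32>,
}

impl TokenSource {
    fn new(initial: &str) -> Self {
        TokenSource {
            current: RefCell::new(initial.to_string()),
            refreshes: Cell::new(0),
        }
    }

    /// Returns whatever token is stored; it may already have expired.
    fn cached(&self) -> String {
        self.current.borrow().clone()
    }

    /// Simulates fetching a fresh token and storing it for later calls.
    fn refresh(&self) -> String {
        let count = self.refreshes.get() + 1;
        self.refreshes.set(count);
        let token = format!("fresh-{}", count);
        *self.current.borrow_mut() = token.clone();
        token
    }
}

/// Runs `request` with the cached token. If the server rejects it as
/// unauthorized, refreshes the token once and retries.
fn with_token_retry<T>(
    tokens: &TokenSource,
    mut request: impl FnMut(&str) -> Result<T, RequestError>,
) -> Result<T, RequestError> {
    match request(&tokens.cached()) {
        Err(RequestError::Unauthorized) => request(&tokens.refresh()),
        other => other,
    }
}
```

Retrying only once keeps a genuinely revoked credential from looping forever; any error other than `Unauthorized` is returned to the caller untouched.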

Closes #ISSUE

Release Notes:

- Fixed some rare cases where API requests would fail with Unauthorized
2026-05-12 19:40:00 +00:00
Ben Brandt
78c889c21d
open_ai: Responses API improvements (#56476)
Release Notes:

- Removed deprecated OpenAI models
- Added support for gpt-5.4-nano/mini models for OpenAI provider
- Improved output quality when using OpenAI models

---------

Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
Co-authored-by: Smit Barmase <heysmitbarmase@gmail.com>
Co-authored-by: Gaauwe Rombouts <mail@grombouts.nl>
2026-05-12 14:47:16 +00:00
Lukas Wirth
c5a2807492
Remove smol as a dependency from a bunch of crates (#53603)
We aren't making use of it in these crates and it unblocks some
web-related work

Release Notes:

- N/A or Added/Fixed/Improved ...
2026-04-24 10:29:51 +00:00
Finn Evers
9b40411c6a
Fix bad GitHub merge queue merge (#54721)
No, sadly, the title is not a typo. See
https://www.githubstatus.com/incidents/zsg1lk7w13cf for the context.
I'll read with joy and popcorn through that root cause analysis.

It makes literally zero sense what happened here, but for some completely
bonkers reason GitHub completely messed up the merge queue with
https://github.com/zed-industries/zed/pull/54632.

I have no idea how it happened. It makes literally zero sense. A PR
going into the merge queue should have the same LoC when getting out of
it. GitHub obviously does not check this. GitHub causes extra work with
a feature that is supposed to save time.

Thanks, I guess.

Release Notes:

- N/A

---------

Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
2026-04-23 23:47:30 +00:00
Danilo Leal
0ab64d6414
branch_picker: Add button to filter remote branches (#54632)
This PR brings back the button to filter remote branches when accessing
the title bar's branch picker with the mouse. It was unintentionally
removed when we introduced the new worktree picker.

Release Notes:

- N/A
2026-04-23 18:26:44 +00:00
Eric Holk
4a5fbf6a3f
Request summarized thinking for Claude Opus 4.7 (#54217)
Starting with Claude Opus 4.7, Anthropic omits thinking content from
responses by default; callers must pass `display: "summarized"` to keep
seeing thinking summaries. Without opting in, the agent UI shows a long
pause with no visible thinking, and users get no progress indication
during extended reasoning.

This extends the adaptive-thinking wire type with an optional `display`
field and requests `Summarized` from every call site that builds an
adaptive thinking request (direct Anthropic, Copilot Chat proxy, Zed
Cloud, and Bedrock).
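
A rough sketch of what an adaptive-thinking wire type with an optional `display` field could look like. This is not the type from this PR: `AdaptiveThinking`, `ThinkingDisplay`, and `to_json` are hypothetical names, the `"type":"adaptive"` tag is an assumption about the wire format, and the JSON is written by hand only to keep the example dependency-free:

```rust
/// Hypothetical mirror of the optional `display` setting.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ThinkingDisplay {
    Summarized,
}

impl ThinkingDisplay {
    /// The string sent over the wire for this variant.
    fn as_wire_str(self) -> &'static str {
        match self {
            ThinkingDisplay::Summarized => "summarized",
        }
    }
}

/// Hypothetical adaptive-thinking request fragment.
struct AdaptiveThinking {
    display: Option<ThinkingDisplay>,
}

impl AdaptiveThinking {
    /// Serializes the fragment; `display` is left out entirely when `None`,
    /// so callers that do not opt in send the same payload as before.
    fn to_json(&self) -> String {
        match self.display {
            Some(display) => format!(
                "{{\"type\":\"adaptive\",\"display\":\"{}\"}}",
                display.as_wire_str()
            ),
            None => "{\"type\":\"adaptive\"}".to_string(),
        }
    }
}
```

Making the field optional means a request built without it is byte-for-byte what was sent previously, while every call site that opts in sends `"display":"summarized"`.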

## Notes

- Applied at the adaptive-thinking layer rather than special-casing Opus
4.7. The `display` parameter is accepted by every
adaptive-thinking-capable model, and the previous behavior (visible
summaries) is what users already see on Opus 4.6 / Sonnet 4.6, so there
is no behavior change for those models.

Release Notes:

- Restored thinking summaries for Claude Opus 4.7.
2026-04-23 15:43:02 +00:00
Ben Brandt
2eafa6e6aa
language_models: Remove unused language model token counting (#54177)
Drop the `count_tokens` API and related implementations across
providers, and remove the unused `tiktoken-rs` dependency.

I was going to update the dependency because they finally released a fix
we needed. But then I realized we only used this API in one place, the
Rules library. And for most models it would have been wildly incorrect
because we use tiktoken, i.e. OpenAI tokenizers, for almost every model,
which is going to give incorrect results.

Given that, I just removed these because the difference in how we get
these has caused plenty of confusion in the past.

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Release Notes:

- N/A
2026-04-22 13:39:48 +00:00
Guilherme do Amaral Alves
7b082cbb6f
Add interleaved_reasoning option to openai compatible models (#54016)
Release Notes:

- Added interleaved_reasoning option to openai compatible models

---

This PR adds the interleaved_reasoning option for OpenAI-compatible
models, addressing the issue described in
https://github.com/ggml-org/llama.cpp/issues/20837.

In my testing, enabling interleaved_reasoning not only resolved the
tool-calling issues encountered by Qwen3.5 models in llama.cpp, but also
appeared to improve the model's coding capabilities. I have also
verified the outgoing requests using a proxy to ensure the parameter is
being sent correctly. It is also likely that this change will benefit
other models and providers as well.

Note: While I used AI to assist with the implementation, I have reviewed
and tested the changes. As I am relatively new to Rust and the Zed
codebase, I would appreciate any feedback or suggestions for
improvement. I am happy to make further adjustments if needed.

Thank you all for building such an amazing editor!

Co-authored-by: Oleksiy Syvokon <oleksiy@zed.dev>
2026-04-22 10:40:37 +00:00
Richard Feldman
76aab0c35c
Fix RefCell panic in cloud model token counting (#54188)
Fixes #54140

When `RulesLibrary::count_tokens` calls
`CloudLanguageModel::count_tokens` for Google cloud models, it does so
inside a `cx.update` closure, which holds a mutable borrow on the global
`AppCell`. The Google provider branch then called
`token_provider.auth_context(&cx.to_async())`, which created a new
`AsyncApp` handle and tried to take a shared borrow on the same
`RefCell` — causing a "RefCell already mutably borrowed" panic.

This only affects Google models because they are the only provider that
counts tokens server-side via an HTTP request (requiring
authentication). The other providers (Anthropic, OpenAI, xAI) count
tokens locally using tiktoken, so they never call `auth_context` during
`count_tokens`.

The fix makes `CloudLlmTokenProvider::auth_context` generic over `impl
AppContext` instead of requiring `&AsyncApp`. This allows the
`count_tokens` call site to pass `&App` directly (which reads entities
without re-borrowing the `RefCell`), while all other call sites that
already pass `&AsyncApp` (e.g. `stream_completion`, `refresh_models`)
continue to work unchanged.
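
The failure mode and the fix can be reproduced with plain `std::cell::RefCell`. This is a simplified sketch, not gpui code: `AppState`, `ReadState`, and `Handle` are hypothetical stand-ins for the app state, `AppContext`, and `AsyncApp`, and `try_borrow` is used so the conflict shows up as `None` rather than a panic:

```rust
use std::cell::RefCell;

/// Hypothetical stand-in for the global application state.
struct AppState {
    auth: String,
}

/// Hypothetical trait playing the role of `AppContext`: anything that can
/// read the application state.
trait ReadState {
    fn read_auth(&self) -> Option<String>;
}

/// Direct access, like `&App`: the caller already holds the borrow, so
/// reading needs no further bookkeeping.
impl ReadState for AppState {
    fn read_auth(&self) -> Option<String> {
        Some(self.auth.clone())
    }
}

/// Handle access, like `AsyncApp`: every read takes its own borrow on the cell.
struct Handle<'a>(&'a RefCell<AppState>);

impl<'a> ReadState for Handle<'a> {
    fn read_auth(&self) -> Option<String> {
        // `borrow()` would panic while a mutable borrow is live;
        // `try_borrow()` reports the same conflict as an error instead.
        self.0.try_borrow().ok().map(|state| state.auth.clone())
    }
}

/// Generic over the context, so either kind of caller can use it.
fn auth_context(cx: &impl ReadState) -> Option<String> {
    cx.read_auth()
}
```

While a `borrow_mut()` guard on the cell is alive, `auth_context(&Handle(&cell))` yields `None` (the original code panicked at this point), whereas `auth_context(&*guard)` succeeds because it reuses the borrow the caller already holds.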

Release Notes:

- Fixed a crash ("RefCell already mutably borrowed") that could occur
when counting tokens with Google cloud language models.
2026-04-17 11:03:46 -04:00
Richard Feldman
87b47a4b08
Add Claude Opus 4.7 BYOK (#54077)
Add Claude Opus 4.7 (`claude-opus-4-7`) to the anthropic, bedrock, and
opencode provider crates.

Key specs:
- 1M token context window
- 128k max output tokens
- Adaptive thinking support
- AWS Bedrock cross-region inference (global, US, EU, AU)

Release Notes:

- Added Claude Opus 4.7 as an available language model
2026-04-17 10:51:32 -04:00
Agus Zubiaga
98c17ca160
language_models: Refactor deps and extract cloud (#53270)
- `language_model` no longer depends on provider-specific crates such as
`anthropic` and `open_ai` (inverted dependency)
- `language_model_core` was extracted from `language_model`; it contains
the types that the provider-specific crates convert to and from.
- `gpui::SharedString` has been extracted into its own crate (still
exposed by `gpui`), so `language_model_core` and provider API crates
don't have to depend on `gpui`.
- Removes some unnecessary `&'static str` | `SharedString` -> `String`
-> `SharedString` conversions across the codebase.
- Extracts the core logic of the cloud `LanguageModelProvider` into its
own crate with simpler dependencies.
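
The inverted dependency can be illustrated as follows. This is a sketch under assumptions, not the actual crate contents: `Role`, `Message`, and `WireMessage` are hypothetical types, and the comments mark which crate each part would live in:

```rust
// --- would live in the shared core crate, which has no UI dependencies ---

/// Hypothetical provider-neutral role.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Role {
    User,
    Assistant,
}

/// Hypothetical provider-neutral message.
#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: Role,
    text: String,
}

// --- would live in a provider API crate that depends only on the core crate ---

/// Hypothetical provider-specific wire format.
#[derive(Debug, PartialEq)]
struct WireMessage {
    role: &'static str,
    content: String,
}

/// The provider crate owns the conversion, so the core crate never needs
/// to know that this provider exists.
impl From<Message> for WireMessage {
    fn from(message: Message) -> Self {
        let role = match message.role {
            Role::User => "user",
            Role::Assistant => "assistant",
        };
        WireMessage {
            role,
            content: message.text,
        }
    }
}
```

Because the `From` impl sits next to the wire type, adding or removing a provider in this layout touches only that provider's crate.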


Release Notes:

- N/A

---------

Co-authored-by: John Tur <john-tur@outlook.com>
2026-04-07 12:28:19 -03:00