Commit graph

122 commits

Author SHA1 Message Date
David Wu
ebed7d39a9
agent: Add Opus 4.7 speed support (#56701)
Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Release Notes:

- N/A
2026-05-15 13:57:33 +00:00
Bennet Bo Fenner
956dbea79a
copilot: Fix cache control error (#56632)
#56472 broke Copilot chat

> Failed to connect to API: 400 Bad Request {"message":"cache_control:
Extra inputs are not permitted"}

This PR keeps Copilot on the legacy caching approach.

Release Notes:

- N/A
2026-05-13 13:34:47 +00:00
Richard Feldman
ad916ca1af
anthropic: Use automatic prompt caching with long-lived anchors on tools and system (#56472)
Switches the Anthropic provider from hand-stamping `cache_control` onto
the last message content block over to Anthropic's top-level automatic
prompt caching, paired with explicit long-TTL (1h) anchors on the last
tool definition and on the system prompt.

The prefix order `tools` → `system` → `messages` satisfies Anthropic's
requirement that longer TTLs appear earlier in the prefix, so the static
prefix is cached for 1h (surviving idle gaps longer than the 5-minute
default) while the rapidly-changing conversation tail uses the
free-to-refresh 5-minute TTL via the top-level automatic breakpoint.
Three of the four available cache breakpoints are used (last tool,
system, automatic conversation), leaving one in reserve.

As a side benefit, this fixes a latent issue where the previous stamping
loop could place `cache_control` on a `Thinking` content block, which
the Anthropic API does not allow. Automatic caching is documented to
walk past ineligible blocks (including thinking) when selecting its
breakpoint, so we now delegate that responsibility to the server.

The new shape we send (when caching is enabled):

```json
{
  "tools": [{ "...": "...", "cache_control": {"type": "ephemeral", "ttl": "1h"} }],
  "system": [
    {"type": "text", "text": "...", "cache_control": {"type": "ephemeral", "ttl": "1h"}}
  ],
  "messages": [ /* no per-block cache_control */ ],
  "cache_control": {"type": "ephemeral"}
}
```
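The ordering constraint described above can be sketched as a small check. This is only an illustration, not the provider's actual code: the `Ttl` enum and both helper functions are hypothetical names, and the limit of four breakpoints is taken from the description above.

```rust
/// Cache TTLs that appear in the request shape above.
/// Variant order matters: the derived `Ord` makes `FiveMinutes < OneHour`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Ttl {
    /// Default TTL for a bare `{"type": "ephemeral"}`.
    FiveMinutes,
    /// Explicit `"ttl": "1h"`.
    OneHour,
}

/// Maximum number of cache breakpoints per request (per the text above).
const MAX_BREAKPOINTS: usize = 4;

/// Returns true when no breakpoint has a longer TTL than one that
/// precedes it. `breakpoints` is listed in prefix order:
/// tools, then system, then messages.
fn ttls_are_ordered(breakpoints: &[Ttl]) -> bool {
    breakpoints.windows(2).all(|pair| pair[0] >= pair[1])
}

/// Number of breakpoints still unused after the ones listed.
fn breakpoints_in_reserve(breakpoints: &[Ttl]) -> usize {
    MAX_BREAKPOINTS.saturating_sub(breakpoints.len())
}
```

With the layout from this PR (`[OneHour, OneHour, FiveMinutes]`), the check passes and one breakpoint remains in reserve.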

Release Notes:

- Improved Anthropic prompt cache utilization, reducing latency and cost
for ongoing conversations

---------

Co-authored-by: Martin Ye <martinye022@gmail.com>
2026-05-12 21:09:39 +00:00
Bennet Bo Fenner
ee309a0000
anthropic: Dynamically fetch models from /models (#56397)
The most compelling reason to make this change is that we don't have to
ship a new Zed binary when Anthropic releases a new model.

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Release Notes:

- anthropic: Dynamically fetch available models from Anthropic API

---------

Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
2026-05-11 13:19:20 +00:00
Bennet Bo Fenner
bf3fc2336d
agent: Allow tools to output multiple content parts (#54518)
Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Release Notes:

- N/A
2026-04-27 12:36:11 +00:00
Finn Evers
9b40411c6a
Fix bad GitHub merge queue merge (#54721)
No, sadly, the title is not a typo. See
https://www.githubstatus.com/incidents/zsg1lk7w13cf for the context.
I'll read with joy and popcorn through that root cause analysis.

It makes literally zero sense what happened here, but for some completely
bonkers reason GitHub completely messed up the merge queue with
https://github.com/zed-industries/zed/pull/54632.

I have no idea how it happened. It makes literally zero sense. A PR
going into the merge queue should have the same LoC when getting out of
it. GitHub obviously does not check this. GitHub causes extra work with
a feature that is supposed to save time.

Thanks, I guess.

Release Notes:

- N/A

---------

Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
2026-04-23 23:47:30 +00:00
Danilo Leal
0ab64d6414
branch_picker: Add button to filter remote branches (#54632)
This PR brings back the button to filter remote branches when accessing
the title bar's branch picker with the mouse. It was unintentionally
removed when we introduced the new worktree picker.

Release Notes:

- N/A
2026-04-23 18:26:44 +00:00
Eric Holk
4a5fbf6a3f
Request summarized thinking for Claude Opus 4.7 (#54217)
Starting with Claude Opus 4.7, Anthropic omits thinking content from
responses by default; callers must pass `display: "summarized"` to keep
seeing thinking summaries. Without opting in, the agent UI shows a long
pause with no visible thinking, and users get no progress indication
during extended reasoning.

This extends the adaptive-thinking wire type with an optional `display`
field and requests `Summarized` from every call site that builds an
adaptive thinking request (direct Anthropic, Copilot Chat proxy, Zed
Cloud, and Bedrock).

## Notes

- Applied at the adaptive-thinking layer rather than special-casing Opus
4.7. The `display` parameter is accepted by every
adaptive-thinking-capable model, and the previous behavior (visible
summaries) is what users already see on Opus 4.6 / Sonnet 4.6, so there
is no behavior change for those models.
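As a rough sketch of the wire type change (not the actual definition in the `anthropic` crate): the type and method names below are hypothetical, the JSON is written by hand to avoid dependencies, and the `"type": "adaptive"` tag is an assumption about the request format. Only the `display: "summarized"` field comes from this description.

```rust
/// How thinking content should be returned. Only the variant this PR
/// requests is modelled here.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ThinkingDisplay {
    Summarized,
}

/// Hypothetical stand-in for the adaptive-thinking wire type, extended
/// with an optional `display` field.
struct AdaptiveThinking {
    display: Option<ThinkingDisplay>,
}

impl AdaptiveThinking {
    /// Serializes to JSON, omitting `display` when it is `None` so that
    /// older behavior is unchanged for callers that do not opt in.
    fn to_json(&self) -> String {
        match self.display {
            Some(ThinkingDisplay::Summarized) => {
                r#"{"type":"adaptive","display":"summarized"}"#.to_string()
            }
            None => r#"{"type":"adaptive"}"#.to_string(),
        }
    }
}
```

Keeping the field optional means every call site has to opt in explicitly, which matches the approach of requesting `Summarized` wherever an adaptive thinking request is built.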

Release Notes:

- Restored thinking summaries for Claude Opus 4.7.
2026-04-23 15:43:02 +00:00
Ben Brandt
2eafa6e6aa
language_models: Remove unused language model token counting (#54177)
Drop the `count_tokens` API and related implementations across
providers, and remove the unused `tiktoken-rs` dependency.

I was going to update the dependency because they finally released a fix
we needed. But then I realized we only used this API in one place, the
Rules library. And for most models it would have been wildly incorrect
because we use tiktoken, i.e. OpenAI tokenizers, for almost every model.

Given that, I just removed these because the difference in how we get
these has caused plenty of confusion in the past.

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Release Notes:

- N/A
2026-04-22 13:39:48 +00:00
Richard Feldman
87b47a4b08
Add Claude Opus 4.7 BYOK (#54077)
<img width="767" height="428" alt="Screenshot 2026-04-16 at 11 29 13 AM"
src="https://github.com/user-attachments/assets/e8b450fa-aefc-4dec-a286-b211bd492011"
/>

Add Claude Opus 4.7 (`claude-opus-4-7`) to the anthropic, bedrock, and
opencode provider crates.

Key specs:
- 1M token context window
- 128k max output tokens
- Adaptive thinking support
- AWS Bedrock cross-region inference (global, US, EU, AU)

Release Notes:

- Added Claude Opus 4.7 as an available language model
2026-04-17 10:51:32 -04:00
Enoch
8420716bd4
anthropic: Preserve custom model thinking mode after thinking-toggle refactor (#52975)
PR #51946 broke `Model::Custom` thinking behavior: `mode()`,
`supports_thinking()`, and `supports_adaptive_thinking()` all inferred
capabilities from hardcoded built-in model lists, so any `Custom`
variant always fell back to `Default` regardless of its configured
`mode` field.

### Fixes

- **`Model::mode()`** — `Custom` now short-circuits to `mode.clone()`
before the built-in inference logic
- **`Model::supports_thinking()`** — `Custom` returns `true` when `mode`
is `Thinking { .. }` or `AdaptiveThinking`
- **`Model::supports_adaptive_thinking()`** — `Custom` returns `true`
when `mode` is `AdaptiveThinking`

Built-in model behavior is unchanged.
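A minimal sketch of the three fixes, assuming a simplified `Model` enum: `BuiltIn` is a hypothetical stand-in for all built-in variants, and the `budget_tokens` field is an assumption about what `Thinking { .. }` carries.

```rust
#[derive(Clone, Debug, PartialEq)]
enum ModelMode {
    Default,
    Thinking { budget_tokens: Option<u32> },
    AdaptiveThinking,
}

enum Model {
    /// Hypothetical placeholder for every built-in model variant.
    BuiltIn,
    /// A user-configured model that carries its own mode.
    Custom { mode: ModelMode },
}

impl Model {
    /// `Custom` short-circuits to its configured mode; built-ins would
    /// go through the inference logic (reduced to `Default` here).
    fn mode(&self) -> ModelMode {
        match self {
            Model::Custom { mode } => mode.clone(),
            Model::BuiltIn => ModelMode::Default,
        }
    }

    /// True for `Custom` when the mode is either thinking flavor.
    fn supports_thinking(&self) -> bool {
        match self {
            Model::Custom { mode } => matches!(
                mode,
                ModelMode::Thinking { .. } | ModelMode::AdaptiveThinking
            ),
            Model::BuiltIn => false,
        }
    }

    /// True for `Custom` only when the mode is `AdaptiveThinking`.
    fn supports_adaptive_thinking(&self) -> bool {
        match self {
            Model::Custom { mode } => matches!(mode, ModelMode::AdaptiveThinking),
            Model::BuiltIn => false,
        }
    }
}
```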

### Tests

Three regression tests added covering the three `Custom` mode cases:
explicit `Thinking`, `AdaptiveThinking`, and `Default` (which must
disable both flags).

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [ ] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Release Notes:

- Fixed custom Anthropic models losing their configured
thinking/adaptive-thinking mode after the thinking-toggle refactor
(#51946)

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
2026-04-14 12:06:17 +03:00
Danilo Leal
399d3d267e
docs: Update mentions to "assistant panel" (#53514)
We don't use this terminology anymore; now it's "agent panel".

Release Notes:

- N/A
2026-04-09 10:42:21 -03:00
Agus Zubiaga
98c17ca160
language_models: Refactor deps and extract cloud (#53270)
- `language_model` no longer depends on provider-specific crates such as
`anthropic` and `open_ai` (inverted dependency)
- `language_model_core` was extracted from `language_model`; it
contains the types that the provider-specific crates convert to/from.
- `gpui::SharedString` has been extracted into its own crate (still
exposed by `gpui`), so `language_model_core` and provider API crates
don't have to depend on `gpui`.
- Removes some unnecessary `&'static str` | `SharedString` -> `String`
-> `SharedString` conversions across the codebase.
- Extracts the core logic of the cloud `LanguageModelProvider` into its
own crate with simpler dependencies.


Release Notes:

- N/A

---------

Co-authored-by: John Tur <john-tur@outlook.com>
2026-04-07 12:28:19 -03:00
Richard Feldman
0f173eb8ae
Remove deprecated 1M context beta header for Sonnet 4.5 (#52767)
The `CONTEXT_1M_BETA_HEADER` (`context-1m-2025-08-07`) is deprecated for
Sonnet 4 and 4.5. This removes the constant from the anthropic crate and
the match arm in `beta_headers()` that sent it for
`ClaudeSonnet4_5_1mContext`.

Note: The bedrock crate still has its own copy of this constant, used
when the user-configurable `allow_extended_context` setting is enabled.
That may warrant a separate cleanup.

Closes AI-114

Release Notes:

- N/A
2026-03-31 10:54:52 -04:00
John Tur
1dfe836d3d
Remove settings dependency from anthropic (#51979)
Release Notes:

- N/A
2026-03-19 19:48:30 -04:00
Bennet Bo Fenner
68d96077f3
anthropic: Add support for thinking toggle (#51946)
This adds support for the thinking toggle + reasoning effort for the
Anthropic provider

Release Notes:

- anthropic: Added support for selecting reasoning effort

---------

Co-authored-by: Marshall Bowers <git@maxdeviant.com>
2026-03-19 17:43:06 +01:00
Mikayla Maki
662f2c7857
Update BYOK to 1m context windows (#51625)
Before you mark this PR as ready for review, make sure that you have:
- [x] Added a solid test coverage and/or screenshots from doing manual
testing
- [x] Done a self-review taking into account security and performance
aspects
- [x] Aligned any UI changes with the [UI
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)

Release Notes:

- Updated our BYOK integration to support the new 1M context windows for
Opus and Sonnet.
2026-03-16 09:10:01 -07:00
Piotr Osiewicz
97421c670e
Remove unreferenced dev dependencies (#51093)
This will help with test times (in some cases), as nextest cannot figure
out whether a given rdep is actually an alive edge of the build graph

Before you mark this PR as ready for review, make sure that you have:
- [ ] Added a solid test coverage and/or screenshots from doing manual
testing
- [ ] Done a self-review taking into account security and performance
aspects
- [ ] Aligned any UI changes with the [UI
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)

Release Notes:

- N/A
2026-03-09 13:22:12 +01:00
John Tur
dde76cd5f8
Enable extended reasoning for Anthropic models in Copilot (#46540)
Fixes https://github.com/zed-industries/zed/issues/45668

https://github.com/microsoft/vscode-copilot-chat used as a reference for
headers and properties we need to set

| Before | After | 
| --- | --- |
| <img width="300"
src="https://github.com/user-attachments/assets/d112a9ef-52d2-42ff-a77b-4b4b15f950fe"
/>| <img width="300"
src="https://github.com/user-attachments/assets/0f1d7ae0-bee1-46f7-92ef-aea0fa6cde7a"
/> |

Release Notes:

- Enabled thinking mode when using Anthropic models with Copilot
2026-03-08 09:34:46 +00:00
Tom Houlé
6a749380aa
Add fast mode toggle in agent panel (#49714)
This is a staff-only toggle for now, since the consequences of
activating it are not obvious and quite dire (tokens cost 6 times
more).

Also, persist thinking, thinking effort and fast mode in DbThread so the
thinking mode toggle and thinking effort are persisted.

Release Notes:

- Agent: The thinking mode toggle and thinking effort are now persisted
when selecting a thread from history.
2026-02-26 21:19:41 +01:00
Bennet Bo Fenner
a2e34cb7bf
agent: Implement streaming for edit file tool (#50004)
Before you mark this PR as ready for review, make sure that you have:
- [x] Added a solid test coverage and/or screenshots from doing manual
testing
- [x] Done a self-review taking into account security and performance
aspects
- [x] Aligned any UI changes with the [UI
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)

Release Notes:

- N/A

---------

Co-authored-by: Zed Zippy <234243425+zed-zippy[bot]@users.noreply.github.com>
2026-02-25 22:58:25 +00:00
Raphael Lüthy
325e941289
anthropic: Support alternative provider SSE formatting (#47847)
The issue I ran into was that responses from Anthropic-compatible
providers, like Kimi for Coding, have no space after `data:`. This
change adds a quick check so that those providers work as well.

Before, the request resolved but showed no output:
<img width="50%" alt="CleanShot 2026-01-28 at 12 50 31@2x"
src="https://github.com/user-attachments/assets/c3c8fe27-348e-4b21-a5f1-25bcc82f3774"
/>

Now it returns the proper result:
<img width="50%" alt="CleanShot 2026-01-28 at 12 56 30@2x"
src="https://github.com/user-attachments/assets/4e524c1e-78ab-4956-bd65-a919d46adc59"
/>

Normal Anthropic models still work as expected:
<img width="50%" alt="CleanShot 2026-01-28 at 12 58 37@2x"
src="https://github.com/user-attachments/assets/5a2906aa-1183-45b6-939b-01a6830f3385"
/>

Config to test:
```json
"language_models": {
  "anthropic": {
    "api_url": "https://api.kimi.com/coding",
    "available_models": [
      {
        "name": "kimi-for-coding",
        "display_name": "Kimi 2.5 Coding",
        "max_tokens": 262144,
        "max_output_tokens": 32768
      }
    ]
  }
}
```


TLDR:
- Accepts SSE `data:{...}` lines (no space) emitted by some alternative
Anthropic providers, in addition to the standard `data: {...}` format.
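The check could look roughly like the following. This is a sketch of the idea rather than the code in this PR, and `sse_data_payload` is a hypothetical name.

```rust
/// Extracts the payload from an SSE `data` line, accepting both the
/// standard `data: {...}` form and the `data:{...}` form without a space.
/// Returns `None` for lines that are not `data` lines.
fn sse_data_payload(line: &str) -> Option<&str> {
    // Try the standard prefix first so the single separator space is
    // not left at the start of the payload.
    line.strip_prefix("data: ")
        .or_else(|| line.strip_prefix("data:"))
}
```

Trying the longer prefix first keeps the standard format byte-for-byte identical to what it produced before, so only the previously rejected lines change behavior.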

Release Notes:

- Fixed Anthropic streaming for alternative providers by accepting SSE `data:{...}` (no space) lines.

---------

Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
2026-02-20 09:47:00 +00:00
Ben Brandt
0fdf175c32
anthropic: Remove deprecated models (#49522)
Release Notes:

- anthropic: Removed models that have been deprecated from their API.
2026-02-18 20:19:38 +00:00
Richard Feldman
5670e66fa7
Add support for Claude Sonnet 4.6 (#49386)
<img width="435" height="211" alt="Screenshot 2026-02-17 at 1 32 48 PM"
src="https://github.com/user-attachments/assets/136c188d-5001-4526-961e-9f7faccc5f7a"
/>


Add support for the new Claude Sonnet 4.6 model across the anthropic,
bedrock, and language_models crates. Includes base, thinking, and 1M
context variants.

Closes AI-39

Release Notes:

- Added BYOK support for Claude Sonnet 4.6
2026-02-17 18:45:12 +00:00
Mikayla Maki
85294063fc
Strip broken thinking blocks from Anthropic requests (#48548)
TODO:

- [x] Review code
- [x] Decide whether to keep ignored API tests

Release Notes:

- Fixed a bug where cancelling a thread mid-thought would cause further
anthropic requests to fail
- Fixed a bug where the model configured on a thread would not be
persisted alongside that thread
2026-02-07 04:21:58 +00:00
Richard Feldman
acbc6a16ac
Remove fine-grained tool streaming beta header (now GA) (#48631)
Fine-grained tool streaming is now [generally available on all models
and
platforms](https://platform.claude.com/docs/en/release-notes/overview#february-5-2026)
as of February 5, 2026, so the `fine-grained-tool-streaming-2025-05-14`
beta header is officially listed as no longer needed.

See
https://github.com/zed-industries/zed/pull/48508#discussion_r2773653965

Release Notes:

- N/A
2026-02-06 21:49:14 +00:00
Marshall Bowers
9860106b8e
agent: Add support for setting thinking effort for Zed provider (#48545)
This PR adds the ability to set the thinking effort of a model.

Right now this only applies to Opus 4.6 through the Zed provider.

This is gated behind the `cloud-thinking-toggle` feature flag.

UI is still rough; needs a design pass:

<img width="639" height="163" alt="Screenshot 2026-02-05 at 7 45 54 PM"
src="https://github.com/user-attachments/assets/2b5a9ef8-74cd-498e-9c81-b92666572409"
/>

<img width="263" height="148" alt="Screenshot 2026-02-05 at 7 45 58 PM"
src="https://github.com/user-attachments/assets/40232cb0-1743-443b-b04c-5cd33065513d"
/>

Release Notes:

- N/A
2026-02-06 01:04:53 +00:00
Marshall Bowers
7dcff21dc9
anthropic: Update types for adaptive thinking (#48517)
This PR updates the Anthropic types with support for [adaptive
thinking](https://platform.claude.com/docs/en/build-with-claude/adaptive-thinking).

We're not actually using adaptive thinking yet.

Release Notes:

- N/A
2026-02-05 19:55:42 +00:00
Richard Feldman
24b6cbf575
Add Claude Opus 4.6 and 1M context window model variants (#48508)
<img width="588" height="485" alt="Screenshot 2026-02-05 at 1 29 10 PM"
src="https://github.com/user-attachments/assets/f3d36c8b-b371-4226-af60-bdc2c6b34009"
/>
<img width="586" height="468" alt="Screenshot 2026-02-05 at 1 30 15 PM"
src="https://github.com/user-attachments/assets/878e91ad-948c-4b35-a37b-f5a8db7e0b3f"
/>


This adds Claude Opus 4.6 as a new Anthropic model, along with 1M
context window variants for both Opus 4.6 and Sonnet 4.5.

## Opus 4.6

Adds `ClaudeOpus4_6` and `ClaudeOpus4_6Thinking` with the same
properties as other Claude 4+ models (200k context, 8192 max output
tokens, fine-grained tool streaming beta header).

## 1M context variants

Adds 1M context window variants for Sonnet 4.5 and Opus 4.6. These are
identical to their base models except:
- Context window is 1,000,000 tokens instead of 200,000
- They send the `context-1m-2025-08-07` beta header
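A simplified sketch of how the variants differ, under a few assumptions: only four variants are shown, `ClaudeOpus4_6_1mContext` is a guessed name for the Opus 1M variant, and the other beta headers the models send are left out.

```rust
/// The beta header quoted in the description above.
const CONTEXT_1M_BETA_HEADER: &str = "context-1m-2025-08-07";

#[allow(non_camel_case_types)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum Model {
    ClaudeSonnet4_5,
    ClaudeSonnet4_5_1mContext,
    ClaudeOpus4_6,
    /// Hypothetical name; the description does not spell it out.
    ClaudeOpus4_6_1mContext,
}

impl Model {
    /// Context window size in tokens.
    fn max_token_count(self) -> u64 {
        match self {
            Model::ClaudeSonnet4_5 | Model::ClaudeOpus4_6 => 200_000,
            Model::ClaudeSonnet4_5_1mContext | Model::ClaudeOpus4_6_1mContext => 1_000_000,
        }
    }

    /// Beta headers specific to the 1M variants (others omitted here).
    fn beta_headers(self) -> Vec<&'static str> {
        match self {
            Model::ClaudeSonnet4_5_1mContext | Model::ClaudeOpus4_6_1mContext => {
                vec![CONTEXT_1M_BETA_HEADER]
            }
            Model::ClaudeSonnet4_5 | Model::ClaudeOpus4_6 => Vec::new(),
        }
    }
}
```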

Release Notes:

- Added Claude Opus 4.6
- Now Claude Opus 4.6 and Sonnet 4.5 BYOK models support variations that
have context windows of 1 million tokens (and have different pricing)
2026-02-05 18:50:04 +00:00
Oleksiy Syvokon
4c46872ab7
ep: Handle errored requests in Anthropic batches (#46351)
Also, save all requests in a single sqlite transaction -- much faster.

Release Notes:

- N/A
2026-01-08 10:59:03 +00:00
Richard Feldman
d16619a654
Improve token count accuracy using Anthropic's API (#44943)
Closes #38533

<img width="807" height="425" alt="Screenshot 2025-12-16 at 2 32 21 PM"
src="https://github.com/user-attachments/assets/6ebb915c-91d3-4158-a2b9-9fe17d301dd6"
/>


Release Notes:

- Use up-to-date token counts from LLM responses when reporting tokens
used per thread

---------

Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-16 14:32:41 -05:00
Michael Benfield
488fa02547
Streaming tool use for inline assistant (#44751)
Depends on: https://github.com/zed-industries/zed/pull/44753

Release Notes:

- N/A

---------

Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>
2025-12-14 03:22:20 +00:00
Oleksiy Syvokon
d312d59ace
Add zeta distill command (#44369)
This PR partially implements a knowledge distillation data pipeline.

`zeta distill` gets a dataset of chronologically ordered commits and
generates synthetic predictions with a teacher model (one-shot Claude
Sonnet).

`zeta distill --batches cache.db` enables the Message Batches API. On
the first run, this command collects all LLM requests and uploads a
batch of them to Anthropic. On subsequent runs, it checks the batch
status. If the batch is ready, it downloads the results and puts them
into the local cache.


Release Notes:

- N/A

---------

Co-authored-by: Piotr Osiewicz <24362066+osiewicz@users.noreply.github.com>
Co-authored-by: Ben Kunkle <ben@zed.dev>
2025-12-08 15:13:22 +02:00
Mikayla Maki
bd2c1027fa
Add support for Opus 4.5 (#43425)
Adds support for Opus 4.5
- [x] BYOK
- [x] Amazon Bedrock

Release Notes:

- Added support for Opus 4.5

Co-authored-by: Richard Feldman <oss@rtfeldman.com>
2025-11-24 20:01:43 +00:00
Mikayla Maki
de58a496ef
Fix a bug where Anthropic completions would not work on nightly (#43287)
Follow up to: https://github.com/zed-industries/zed/pull/43185/files

Release Notes:

- N/A

Co-authored-by: Michael <mbenfield@zed.dev>
2025-11-22 00:10:26 +00:00
Andrew Farkas
6899448812
Remove prompt-caching-2024-07-31 beta header for Anthropic AI (#43185)
Closes #42715

Release Notes:

- Remove `prompt-caching-2024-07-31` beta header for Anthropic AI

Co-authored-by: Cole Miller <cole@zed.dev>
2025-11-20 15:16:09 -05:00
Julia Ryan
ef5b8c6fed
Remove workspace-hack (#40216)
We've been considering removing workspace-hack for a couple reasons:
- Lukas ran into a situation where its build script seemed to be causing
spurious rebuilds. This seems more likely to be a cargo bug than an
issue with workspace-hack itself (given that it has an empty build
script), but we don't necessarily want to take the time to hunt that
down right now.
- Marshall mentioned hakari interacts poorly with automated crate
updates (in our case provided by Renovate) because you'd need to run
`cargo hakari generate && cargo hakari manage-deps` after its changes,
and we prefer not to have actions that make commits.

Currently removing workspace-hack causes our workspace to grow from
~1700 to ~2000 crates being built (depending on platform), which is
mainly a problem when you're building the whole workspace or running
tests across the normal and remote binaries (which is where
feature-unification nets us the most sharing). It doesn't impact
incremental times noticeably when you're just iterating on `-p zed`, and
we'll hopefully get these savings back in the future when
rust-lang/cargo#14774 (which re-implements the functionality of hakari)
is finished.

Release Notes:

- N/A
2025-10-17 18:58:14 +00:00
versecafe
2adc023094
anthropic: Haiku 4.5 support (#40298)
Release Notes:

- Added Claude Haiku 4.5

<img width="1512" height="919" alt="Screenshot 2025-10-15 at 5 23 37 PM"
src="https://github.com/user-attachments/assets/fd3eb8e7-ddd8-4d38-a171-400949c0cef4"
/>
2025-10-16 15:59:12 -06:00
Richard Feldman
3ae65153db
Default to Sonnet 4.5 in BYOK (#39132)
<img width="381" height="204" alt="Screenshot 2025-09-29 at 2 29 58 PM"
src="https://github.com/user-attachments/assets/c7aaf0b0-b09b-4ed9-8113-8d7b18eefc2f"
/>


Release Notes:

- Claude Sonnet 4.5 and 4.5 Thinking are now the recommended Anthropic
models
2025-09-29 18:56:03 +00:00
Richard Feldman
4fc4707cfc
Add Sonnet 4.5 support (#39127)
Release Notes:

- Added support for Claude Sonnet 4.5 for Bring-Your-Own-Key (BYOK)
2025-09-29 14:21:58 -04:00
Conrad Irwin
fcdab160f9
Settings refactor (#38367)
Co-Authored-By: Ben K <ben@zed.dev>
Co-Authored-By: Anthony <anthony@zed.dev>
Co-Authored-By: Mikayla <mikayla@zed.dev>

Release Notes:

- settings: Major internal changes to settings. The primary user-facing
effect is that some settings which did not make sense in project
settings files are no longer read from there (for example, the inline
blame settings).

---------

Co-authored-by: Ben Kunkle <ben@zed.dev>
Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>
Co-authored-by: Anthony <anthony@zed.dev>
2025-09-18 16:47:23 +00:00
Umesh Yadav
3c021d0890
language_models: Fix beta_headers for Anthropic custom models (#37306)
Closes #37289

The current implementation has a problem. The **`from_id` method** in
the Anthropic crate works well for predefined models, but not for custom
models that are defined in the settings. This is because it falls back to
using default beta headers, which are incorrect for custom models.

The issue is that the model instance for custom models lives within the
`language_models` provider, so I've updated the **`stream_completion`**
method to explicitly accept beta headers from its caller. Now, the beta
headers are passed from the `language_models` provider all the way to
`anthropic.stream_completion`, which resolves the issue.
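To illustrate the idea of the caller supplying the headers, here is a small sketch. `beta_header_value` is a hypothetical helper, not the real `stream_completion` signature; it assumes the headers end up in one comma-separated `anthropic-beta` value.

```rust
/// Builds a comma-separated beta header value from the provider's
/// defaults plus the extra headers configured on a custom model.
/// Returns `None` when there is nothing to send.
fn beta_header_value(defaults: &[&str], extra: &[String]) -> Option<String> {
    let mut headers: Vec<&str> = defaults.to_vec();
    for header in extra {
        let header = header.trim();
        // Skip blanks and anything already present in the defaults.
        if !header.is_empty() && !headers.contains(&header) {
            headers.push(header);
        }
    }
    if headers.is_empty() {
        None
    } else {
        Some(headers.join(","))
    }
}
```

Because the caller owns the model instance, it is the only place that knows the custom model's configured headers, which is why they are passed down rather than looked up by ID.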

Release Notes:

- Fixed a bug where extra_beta_headers defined in settings for Anthropic
custom models were being ignored.

---------

Signed-off-by: Umesh Yadav <git@umesh.dev>
2025-09-04 06:02:13 +02:00
Antonio Scandurra
39d86eeb7f
Trim API key when submitting requests to LLM providers (#37082)
This prevents the common footgun of copy/pasting an API key
starting/ending with extra newlines, which would lead to a "bad request"
error.
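The fix amounts to something like the sketch below, where `sanitized_api_key` is a hypothetical name rather than the function used in the codebase.

```rust
/// Strips leading and trailing whitespace (including newlines picked up
/// by copy/paste) from an API key before it is used in a request.
fn sanitized_api_key(raw: &str) -> &str {
    raw.trim()
}
```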

Closes #37038 

Release Notes:

- agent: Support pasting language model API keys that contain newlines.
2025-08-28 12:00:44 +00:00
Richard Feldman
0b5592d788
Add Claude Opus 4.1 (#35653)
<img width="348" height="427" alt="Screenshot 2025-08-05 at 1 55 35 PM"
src="https://github.com/user-attachments/assets/52af17a5-0095-4ad9-9afe-ff27aab90e03"
/>

Release Notes:

- Added support for Claude Opus 4.1

Co-authored-by: Marshall Bowers <git@maxdeviant.com>
2025-08-05 18:16:47 +00:00
Michael Sloan
d497f52e17
agent: Improve error handling and retry for zed-provided models (#33565)
* Updates to `zed_llm_client-0.8.5` which adds support for `retry_after`
when anthropic provides it.

* Distinguishes upstream provider errors and rate limits from errors
that originate from zed's servers

* Moves `LanguageModelCompletionError::BadInputJson` to
`LanguageModelCompletionEvent::ToolUseJsonParseError`. While arguably
this is an error case, the logic in thread is cleaner with this move.
There is also precedent for inclusion of errors in the event type -
`CompletionRequestStatus::Failed` is how cloud errors arrive.

* Updates `PROVIDER_ID` / `PROVIDER_NAME` constants to use proper types
instead of `&str`, since they can be constructed in a const fashion.

* Removes use of `CLIENT_SUPPORTS_EXA_WEB_SEARCH_PROVIDER_HEADER_NAME`
as the server no longer reads this header and just defaults to that
behavior.

Release notes for this are covered by #33275

Release Notes:

- N/A

---------

Co-authored-by: Richard Feldman <oss@rtfeldman.com>
Co-authored-by: Richard <richard@zed.dev>
2025-06-30 21:01:32 -06:00
Richard Feldman
c610ebfb03
Thread Anthropic errors into LanguageModelKnownError (#33261)
This PR is in preparation for doing automatic retries for certain
errors, e.g. Overloaded. It doesn't change behavior yet (aside from some
granularity of error messages shown to the user), but rather mostly
changes some error handling to be exhaustive enum matches instead of
`anyhow` downcasts, and leaves some comments for where the behavior
change will be in a future PR.
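The shape of the change is roughly as follows. Both enums are hypothetical, reduced stand-ins and do not mirror the real `AnthropicError` or `LanguageModelKnownError` definitions.

```rust
/// Hypothetical, reduced set of provider errors.
#[derive(Debug)]
enum ProviderError {
    RateLimited,
    Overloaded,
    InvalidRequest(String),
}

/// Hypothetical, reduced set of errors known to the language model layer.
#[derive(Debug, PartialEq)]
enum KnownError {
    RateLimitExceeded,
    Overloaded,
    Other(String),
}

/// Maps provider errors with an exhaustive match. Adding a variant to
/// `ProviderError` makes this fail to compile until it is handled,
/// which a runtime downcast cannot guarantee.
fn to_known_error(error: ProviderError) -> KnownError {
    match error {
        ProviderError::RateLimited => KnownError::RateLimitExceeded,
        // A future PR could retry automatically here.
        ProviderError::Overloaded => KnownError::Overloaded,
        ProviderError::InvalidRequest(message) => KnownError::Other(message),
    }
}
```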

Release Notes:

- N/A
2025-06-23 18:48:26 +00:00
Richard Feldman
5405c2c2d3
Standardize on u64 for token counts (#32869)
Previously we were using a mix of `u32` and `usize`, e.g. `max_tokens:
usize, max_output_tokens: Option<u32>` in the same `struct`.

Although [tiktoken](https://github.com/openai/tiktoken) uses `usize`,
token counts should be consistent across targets (e.g. the same model
doesn't suddenly get a smaller context window if you're compiling for
wasm32), and these token counts could end up getting serialized using a
binary protocol, so `usize` is not the right choice for token counts.

I chose to standardize on `u64` over `u32` because we don't store many
of them (so the extra size should be insignificant) and future models
may exceed `u32::MAX` tokens.
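A small sketch of what the standardized types allow. `ModelLimits` and the helper functions are hypothetical names, and the fixed-width encoding only illustrates the serialization concern mentioned above.

```rust
/// Token limits stored with one fixed-width type on every target.
struct ModelLimits {
    max_tokens: u64,
    max_output_tokens: Option<u64>,
}

/// Checks a request against both limits without overflowing.
fn request_fits(limits: &ModelLimits, prompt_tokens: u64, output_tokens: u64) -> bool {
    let within_output = limits
        .max_output_tokens
        .map_or(true, |max| output_tokens <= max);
    let within_context = prompt_tokens
        .checked_add(output_tokens)
        .map_or(false, |total| total <= limits.max_tokens);
    within_output && within_context
}

/// Converts a tokenizer's `usize` length at the boundary. The cast is
/// lossless on targets where `usize` is at most 64 bits wide.
fn token_count_from_len(len: usize) -> u64 {
    len as u64
}

/// A `u64` always encodes to 8 bytes, whereas `usize` would encode to
/// 4 bytes on wasm32 and 8 bytes on 64-bit targets.
fn encode_token_count(count: u64) -> [u8; 8] {
    count.to_le_bytes()
}
```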

Release Notes:

- N/A
2025-06-17 10:43:07 -04:00
Marshall Bowers
fcf5042007
anthropic: Reorder Model variants in descending order (#32689)
This PR reorders the `Model` variants in the `anthropic` crate in
descending order.

Newer/more powerful models at the top -> older/less powerful models at
the bottom.

Release Notes:

- N/A
2025-06-13 14:01:32 +00:00
Marshall Bowers
cb9beb86bf
anthropic: Refactor a bit (#32685)
This PR applies some refactorings made in our other repos to this
version of the `anthropic` crate.

Release Notes:

- N/A
2025-06-13 13:34:23 +00:00
Ben Brandt
e4bd115a63
More resilient eval (#32257)
Bubbles up rate limit information so that we can retry after a certain
duration if needed higher up in the stack.

Also caps the number of concurrent evals, which helps as well.
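As an illustration of retrying on bubbled-up rate limit information, here is a sketch with hypothetical names (`EvalError`, `retry_delay`); the attempt cap is an assumption and is not described in this PR.

```rust
use std::time::Duration;

/// Hypothetical error type carrying the server-provided wait time.
enum EvalError {
    RateLimited { retry_after: Duration },
    Other(String),
}

/// Decides whether a caller higher up the stack should retry, and after
/// how long. Returns `None` when attempts are exhausted or the error is
/// not a rate limit.
fn retry_delay(error: &EvalError, attempt: u32, max_attempts: u32) -> Option<Duration> {
    if attempt >= max_attempts {
        return None;
    }
    match error {
        EvalError::RateLimited { retry_after } => Some(*retry_after),
        EvalError::Other(_) => None,
    }
}
```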

Release Notes:

- N/A
2025-06-09 18:07:22 +00:00