Commit graph

305 commits

Author SHA1 Message Date
Bennet Bo Fenner
68340172a1
agent: Remove unused LanguageModelImage APIs (#57050)
Pulled out from #56866. Will help with MCP image support

Release Notes:

- N/A
2026-05-18 12:22:22 +00:00
Bennet Bo Fenner
3a742b5e0d
language_models: Remove unused cache_configuration API (#56884)
Release Notes:

- N/A
2026-05-15 16:27:11 +00:00
Bennet Bo Fenner
bf3fc2336d
agent: Allow tools to output multiple content parts (#54518)
Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Closes #ISSUE

Release Notes:

- N/A
2026-04-27 12:36:11 +00:00
Ben Brandt
2eafa6e6aa
language_models: Remove unused language model token counting (#54177)
Drop the `count_tokens` API and related implementations across
providers, and remove the unused `tiktoken-rs` dependency.

I was going to update the dependency because they finally released a fix
we needed. But then I realized we only used this API in one place, the
Rules library. And for most models it would have been wildly incorrect
because we use tiktoken, i.e. OpenAI tokenizers, for almost every model,
which is going to give incorrect results.

Given that, I just removed these because the difference in how we get
these has caused plenty of confusion in the past.

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Release Notes:

- N/A
2026-04-22 13:39:48 +00:00
Daniel Strobusch
f2f9e2766b
language_model: Fix ImageSize storing pre-downscale dimensions (#54357)
When images are resized to meet provider size constraints (Anthropic's
1568px limit or the 5MB encoded-PNG cap), the stored ImageSize was still
recording the original width/height rather than the final post-downscale
dimensions. This caused incorrect token estimation via estimate_tokens()
since it uses width * height / 750.

Use processed_image.dimensions() after all downscale passes so that
ImageSize reflects the actual image sent to the provider.
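
To illustrate the effect, here is a minimal sketch of the `width * height / 750` estimate before and after downscaling. The `ImageSize` struct below is a stand-in and `downscale_to_fit` is a hypothetical helper, not the actual code in `language_model`:

```rust
/// Image dimensions in pixels (a stand-in for the real `ImageSize`).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct ImageSize {
    pub width: u32,
    pub height: u32,
}

impl ImageSize {
    /// Rough token estimate: width * height / 750, as described above.
    pub fn estimate_tokens(&self) -> u64 {
        (self.width as u64 * self.height as u64) / 750
    }
}

/// Hypothetical helper: scale the image so its longer edge is at most `max_edge`.
pub fn downscale_to_fit(size: ImageSize, max_edge: u32) -> ImageSize {
    let longest = size.width.max(size.height);
    if longest <= max_edge {
        return size;
    }
    let scale = max_edge as f64 / longest as f64;
    ImageSize {
        width: (size.width as f64 * scale).round() as u32,
        height: (size.height as f64 * scale).round() as u32,
    }
}
```

With these assumptions, a 4000x3000 image estimates to 16000 tokens at its original size but only 2458 tokens once scaled to fit 1568px, which is why recording the pre-downscale size overstated usage.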

Release Notes:

- Fixed an issue where token estimation would be incorrect in cases where
the thread contained downscaled images.
2026-04-21 12:36:22 +02:00
Anthony Eid
e92a40a9d8
agent: Auto-select user model when there's no default (#54125)
Reimplements #36722 while fixing the race that required the revert in
#36932.

When no default model is configured, this picks an environment fallback
by authenticating all providers. It always prefers the Zed cloud
provider when it's authenticated, and waits for its models to load
before picking another provider as the fallback, so we don't flicker
from Zed models to Anthropic while sign-in is in flight.

The fallback is recomputed whenever provider state changes (via
`ProviderStateChanged`/`AddedProvider`/`RemovedProvider` events), so the
selection becomes correct as soon as cloud models arrive.

### What changed vs. the original PR

- `language_models::init` now owns `authenticate_all_providers`
(previously done in `LanguageModelPickerDelegate` and `agent`'s
`LanguageModels`).
- After all authentications settle, and on any subsequent provider state
change, `update_environment_fallback_model` recomputes the fallback.
- The fallback logic prefers Zed cloud: if the cloud provider is
authenticated, only use it (waiting for its models to load). Otherwise,
fall through to the first authenticated provider with a default or
recommended model.
- `LanguageModelRegistry::default_model()` falls back to
`environment_fallback_model` when no explicit default is set.
- Existing `Thread`s that are empty are updated to the new default when
`DefaultModelChanged` fires, so a blank thread started before sign-in
switches to Zed models once the user signs in.
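
The fallback preference described above can be sketched roughly as follows. `ProviderState` and its fields are hypothetical simplifications; the real logic lives in `update_environment_fallback_model` and works on registry entities:

```rust
/// Simplified view of a provider; all field names here are hypothetical.
pub struct ProviderState {
    pub name: &'static str,
    pub is_zed_cloud: bool,
    pub authenticated: bool,
    /// `None` while the provider's model list is still loading.
    pub default_model: Option<&'static str>,
}

/// Pick the environment fallback: prefer Zed cloud whenever it is
/// authenticated (even if that means waiting for its models), otherwise
/// use the first authenticated provider that already offers a model.
pub fn pick_fallback(providers: &[ProviderState]) -> Option<&'static str> {
    if let Some(cloud) = providers
        .iter()
        .find(|p| p.is_zed_cloud && p.authenticated)
    {
        // Yields `None` while cloud models are loading, so the selection
        // does not flicker to another provider in the meantime.
        return cloud.default_model;
    }
    providers
        .iter()
        .filter(|p| p.authenticated)
        .find_map(|p| p.default_model)
}
```

Re-running `pick_fallback` on every provider state change is what lets the selection settle once the cloud models arrive.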

Release Notes:

- agent: Automatically select a model when there's no selected model or configured default
2026-04-16 18:35:47 -04:00
Agus Zubiaga
98c17ca160
language_models: Refactor deps and extract cloud (#53270)
- `language_model` no longer depends on provider-specific crates such as
`anthropic` and `open_ai` (inverted dependency)
- `language_model_core` was extracted from `language_model` which
contains the types for the provider-specific crates to convert to/from.
- `gpui::SharedString` has been extracted into its own crate (still
exposed by `gpui`), so `language_model_core` and provider API crates
don't have to depend on `gpui`.
- Removes some unnecessary `&'static str` | `SharedString` -> `String`
-> `SharedString` conversions across the codebase.
- Extracts the core logic of the cloud `LanguageModelProvider` into its
own crate with simpler dependencies.


Release Notes:

- N/A

---------

Co-authored-by: John Tur <john-tur@outlook.com>
2026-04-07 12:28:19 -03:00
Bennet Bo Fenner
e2bba5526a
agent: Fix issue with streaming tools when model produces invalid JSON (#52891)
Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Closes #ISSUE

Release Notes:

- N/A
2026-04-06 22:26:26 +02:00
Jakub Konka
29609d3599
language_model: Decouple from Zed-specific implementation details (#52913)
This PR decouples `language_model`'s dependence on Zed-specific
implementation details. In particular
* `credentials_provider` is split into a generic `credentials_provider`
crate that provides a trait, and `zed_credentials_provider` that
implements the said trait for Zed-specific providers and has functions
that can populate a global state with them
* `zed_env_vars` is split into a generic `env_var` crate that provides
generic tooling for managing env vars, and `zed_env_vars` that contains
Zed-specific statics
* `client` is now dependent on `language_model` and not vice versa

Release Notes:

- N/A
2026-04-02 17:06:57 -03:00
Jakub Konka
6663a60876
language_model: Refactor crate structure and dependencies (#52857)
A couple of things that this PR wants to accomplish:
* remove dependency on `settings` crate from `language_model`
* refactor provider-specific code into submodules - to be honest, I
would go one step further and put all provider-specific bits in
`language_models` instead but I realise we have cloud logic in
`language_model` which uses those too making it tricky
* move anthropic-specific telemetry into `language_models` crate - I
think it makes more sense for it to be there

Anyhow, I would very much appreciate it if you could have a look
@mikayla-maki and @maxdeviant and lemme know what you think, if you would
tweak something, etc.

Release Notes:

- N/A
2026-04-01 17:44:25 +02:00
Ben Brandt
76c6004b27
Remove text thread and slash command crates (#52757)
🫡

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Release Notes:

- Removed legacy Text Threads feature to help streamline the new agentic
workflows in Zed. Thanks to all of you who were enthusiastic Text Thread
users over the years ❤️!

---------

Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
2026-03-31 17:55:05 +02:00
Bennet Bo Fenner
46a0262dc5
agent: Remove duplicated description from tool schema (#52678)
Turns out we were including the description of a tool inside the schema
again, which I don't think is needed.
Before:
```
LanguageModelRequestTool {
    name: "web_search",
    description: "Search the web for information using your query.\nUse this when you need real-time information, facts, or data that might not be in your training.\nResults will include snippets and links from relevant web pages.",
    input_schema: Object {
        "required": Array [
            String("query"),
        ],
        "description": String("Search the web for information using your query.\nUse this when you need real-time information, facts, or data that might not be in your training.\nResults will include snippets and links from relevant web pages."),
        "type": String("object"),
        "properties": Object {
            "query": Object {
                "description": String("The search term or question to query on the web."),
                "type": String("string"),
            },
        },
        "additionalProperties": Bool(false),
    },
    use_input_streaming: false,
},
```



After:
```
LanguageModelRequestTool {
    name: "web_search",
    description: "Search the web for information using your query.\nUse this when you need real-time information, facts, or data that might not be in your training.\nResults will include snippets and links from relevant web pages.",
    input_schema: Object {
        "required": Array [
            String("query"),
        ],
        "type": String("object"),
        "properties": Object {
            "query": Object {
                "description": String("The search term or question to query on the web."),
                "type": String("string"),
            },
        },
        "additionalProperties": Bool(false),
    },
    use_input_streaming: false,
},
```


Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Closes #45315

Release Notes:

- agent: Reduced amount of tokens consumed by tool descriptions
2026-03-29 20:59:18 +02:00
Om Chillure
ef4af8f924
Fix/gemini tool schema unsupported keys (#52670)
### Summary

The Gemini API enforces strict validation on `function_declarations` and
rejects requests containing unsupported JSON Schema keywords such as
`additionalProperties`, `propertyNames`. This caused Write mode to fail
with "failed to stream completion" when tools with complex schemas were
used.

This PR strips these unsupported keywords from tool schemas before
sending them to the Gemini API in `adapt_to_json_schema_subset`.
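
A minimal sketch of the stripping step, assuming a recursive walk over the schema. It uses a small hand-rolled `Json` enum instead of `serde_json::Value` to stay dependency-free, and `strip_unsupported_keys` is a hypothetical name, not the body of `adapt_to_json_schema_subset`:

```rust
use std::collections::BTreeMap;

/// Minimal JSON-like value (stand-in for `serde_json::Value`).
#[derive(Clone, Debug, PartialEq)]
pub enum Json {
    Bool(bool),
    Str(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

/// Keywords named in this PR as rejected by the Gemini API.
const UNSUPPORTED_KEYS: [&str; 2] = ["additionalProperties", "propertyNames"];

/// Recursively remove unsupported keywords from every nested object.
/// Caveat of this sketch: it does not distinguish schema keywords from
/// user-defined property names that happen to share these names.
pub fn strip_unsupported_keys(value: &mut Json) {
    match value {
        Json::Object(map) => {
            for key in UNSUPPORTED_KEYS.iter() {
                map.remove(*key);
            }
            for child in map.values_mut() {
                strip_unsupported_keys(child);
            }
        }
        Json::Array(items) => {
            for item in items.iter_mut() {
                strip_unsupported_keys(item);
            }
        }
        Json::Bool(_) | Json::Str(_) => {}
    }
}
```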

### How to Review

- Check `crates/language_model/src/tool_schema.rs` — the
`adapt_to_json_schema_subset` function now removes
`additionalProperties` and `propertyNames` from schemas.
- Tests are added covering removal of these keys and nested schema
handling.
- To reproduce the original issue, send a tool schema containing
`propertyNames` or `additionalProperties` to the Gemini API — it returns
HTTP 400 `INVALID_ARGUMENT`

### How to Test

Run the unit tests:
```sh
cargo test -p language_model
```

Or reproduce the issue manually with:

```
curl -s "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?key=YOUR_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"contents":[{"parts":[{"text":"test"}]}],"tools":[{"functionDeclarations":[{"name":"test","parameters":{"type":"OBJECT","properties":{"field":{"type":"OBJECT","propertyNames":{"pattern":"^[a-z]+$"},"additionalProperties":{"type":"STRING"}}}}}]}]}'
```


#### Closes #52430

- [x] I've reviewed my own diff for quality, security, and reliability
- [ ] Unsafe blocks (if any) have justifying comments
- [ ] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Video 

[Screencast from 2026-03-29
08-32-18.webm](https://github.com/user-attachments/assets/a0069f0e-1f2b-45dc-85bf-f24aacb08599)


### Note: Reopens previous work from closed PR #52644 (fork was deleted)


Release Notes:

- Fixed an issue where Gemini models would not work when using specific
MCP servers
2026-03-29 18:31:42 +02:00
Bennet Bo Fenner
dd0d87f4ee
eval: Improve StreamingEditFileTool performance (#52428)
## Context

| Eval | Score |
|------|-------|
| eval_delete_function | 1.00 |
| eval_extract_handle_command_output | 0.96 |
| eval_translate_doc_comments | 0.96 |

Porting the rest of the evals is still a todo.

## Self-Review Checklist

<!-- Check before requesting review: -->
- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Release Notes:

- N/A

---------

Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
2026-03-26 14:53:49 +01:00
Marshall Bowers
72bc4dc534
cloud_llm_client: Move CompletionIntent to language_model (#52359)
This PR moves the `CompletionIntent` enum from the `cloud_llm_client`
crate to the `language_model` crate, as it is no longer part of the
Cloud interface.

Release Notes:

- N/A
2026-03-25 08:39:17 +01:00
Ben Brandt
b28990e471
language_models: Use weak entities in subscribe/observes around the language model registry (#52312)
## Context

I was getting some leak detection failures in evals and tracked it down
to these entities getting passed into observe/subscribe callbacks and
causing cycles.

Release Notes:

- N/A

Co-authored-by: Lukas Wirth <me@lukaswirth.dev>
2026-03-24 12:08:44 +01:00
Tom Houlé
a30e4d5228
language_model: Clear the LlmApiToken first on org switch (#51826)
When we switch organizations, we try and refresh the token. If the token
refresh fails, we are left with the old LlmApiToken, which is for the
wrong organization. In this commit, we make sure to clear the old token
before trying a refresh on organization switch.
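
The ordering matters more than the mechanics, so here is a minimal synchronous sketch. `TokenCache` and the `refresh` closure are hypothetical; the real `LlmApiToken` is async and shared:

```rust
/// Hypothetical token holder used only to illustrate the ordering.
#[derive(Default)]
pub struct TokenCache {
    token: Option<String>,
}

impl TokenCache {
    pub fn with_token(token: &str) -> Self {
        TokenCache { token: Some(token.to_string()) }
    }

    pub fn current(&self) -> Option<&str> {
        self.token.as_deref()
    }

    /// Clear the old token *before* refreshing, so a failed refresh can
    /// never leave a token for the previous organization in place.
    pub fn switch_organization<F>(&mut self, refresh: F) -> Result<(), String>
    where
        F: FnOnce() -> Result<String, String>,
    {
        self.token = None;
        self.token = Some(refresh()?);
        Ok(())
    }
}
```

If `refresh` fails, `current()` returns `None` rather than the stale token, so later requests fail fast instead of authenticating against the wrong organization.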

Release Notes:

- N/A

---------

Co-authored-by: Neel <neel@zed.dev>
2026-03-19 11:09:55 +01:00
Ben Brandt
38b7c76198
open_ai: Fix tool output for /responses endpoint (#51789)
We were sending the raw tool debug output as JSON to the model rather
than whatever the tool intended as content for the model.
This meant we were sending unneeded information to the model, which
matters in the edit tool case.

Release Notes:

- N/A
2026-03-17 22:14:51 +00:00
Marshall Bowers
a07d0f4d21
Assign meaningful names to some single-letter bindings (#51432)
This PR assigns meaningful names to some single-letter bindings we were
using to refer to the organization.

Release Notes:

- N/A
2026-03-12 22:49:17 +00:00
Tom Houlé
9fb57b0daf
language_model: Centralize LlmApiToken to a singleton (#51225)
The edit prediction, web search and completions endpoints in Cloud all
use tokens called LlmApiToken. These were independently created, cached,
and refreshed in three places: the cloud language model provider, the
edit prediction store, and the cloud web search provider. Each held its
own LlmApiToken instance, meaning three separate requests to get these
tokens at startup / login and three redundant refreshes whenever the
server signaled a token update was needed.

We already had a global singleton reacting to the refresh signals:
RefreshLlmTokenListener. It now holds a single LlmApiToken that all
three services use, performs the refresh itself, and emits
RefreshLlmTokenEvent only after the token is fresh. That event is used
by the language model provider to re-fetch models after a refresh. The
singleton is accessed only through `LlmApiToken::global()`.
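
As a rough illustration of the single-shared-token idea, here is a std-only sketch. `SharedToken` is a hypothetical stand-in; the actual `LlmApiToken::global()` is presumably backed by app-level global state rather than a `OnceLock`:

```rust
use std::sync::{Arc, Mutex, OnceLock};

/// Hypothetical shared token handle; cloning shares the same storage.
#[derive(Clone, Default)]
pub struct SharedToken(Arc<Mutex<Option<String>>>);

impl SharedToken {
    /// Every caller gets a handle to the same underlying token.
    pub fn global() -> SharedToken {
        static GLOBAL: OnceLock<SharedToken> = OnceLock::new();
        GLOBAL.get_or_init(SharedToken::default).clone()
    }

    /// Store a freshly acquired token (one refresh serves all users).
    pub fn set(&self, token: &str) {
        *self.0.lock().unwrap() = Some(token.to_string());
    }

    pub fn get(&self) -> Option<String> {
        self.0.lock().unwrap().clone()
    }
}
```

A refresh performed through one handle is immediately visible through every other handle, which is what removes the three redundant refreshes.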

I have tested this manually, and token acquisition and usage appear
to be working fine.

Edit: I've tested it with a long running session, and refresh seems to
be working fine too.

Release Notes:

- N/A

---------

Co-authored-by: Marshall Bowers <git@maxdeviant.com>
2026-03-11 20:54:51 +01:00
Neel
dc0e41f834
Refresh LLM API token on organization change (#50931)
Emit client-side organization changed events through
`RefreshLlmTokenListener` so it produces the same `RefreshLlmTokenEvent`
used for server-pushed `UserUpdated` messages.

This keeps token refresh fan-out in one place.

Closes CLO-383.

Release Notes:

- N/A

---------

Co-authored-by: Tom Houlé <tom@tomhoule.com>
2026-03-06 19:15:21 +00:00
Tom Houlé
a1d40370cf
cloud_api_client: Send the organization ID in LLM token requests (#50517)
This is already expected on the cloud side. This lets us know under
which organization the user is logged in when requesting an llm_api
token.

Closes CLO-337

Release Notes:

- N/A
2026-03-04 15:45:33 +01:00
Tom Houlé
6a749380aa
Add fast mode toggle in agent panel (#49714)
This is a staff-only toggle for now, since the consequences of
activating it are not obvious and quite dire (tokens cost 6 times
more).

Also, persist thinking, thinking effort and fast mode in DbThread so the
thinking mode toggle and thinking effort are persisted.

Release Notes:

- Agent: The thinking mode toggle and thinking effort are now persisted
when selecting a thread from history.
2026-02-26 21:19:41 +01:00
Bennet Bo Fenner
a2e34cb7bf
agent: Implement streaming for edit file tool (#50004)
Before you mark this PR as ready for review, make sure that you have:
- [x] Added a solid test coverage and/or screenshots from doing manual
testing
- [x] Done a self-review taking into account security and performance
aspects
- [x] Aligned any UI changes with the [UI
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)

Release Notes:

- N/A

---------

Co-authored-by: Zed Zippy <234243425+zed-zippy[bot]@users.noreply.github.com>
2026-02-25 22:58:25 +00:00
Marshall Bowers
42202edee9
Sign out upon receiving an Unauthorized response when acquiring an LLM token (#49673)
This PR makes it so the user gets signed out upon receiving an
Unauthorized response when acquiring an LLM token.

This is a re-landing of #49661.

Closes CLO-324.

Release Notes:

- N/A
2026-02-19 22:13:38 -05:00
Marshall Bowers
94b9628d42
Revert "Sign out upon receiving an Unauthorized response when acquiring an LLM token (#49661)" (#49669)
This PR reverts #49661, as the Collab tests are failing (but were not
caught in CI).

This reverts commit 2f9350bb6b.

Release Notes:

- N/A
2026-02-19 23:54:03 +00:00
Marshall Bowers
2f9350bb6b
Sign out upon receiving an Unauthorized response when acquiring an LLM token (#49661)
This PR makes it so the user gets signed out upon receiving an
Unauthorized response when acquiring an LLM token.

Closes CLO-324.

Release Notes:

- N/A
2026-02-19 22:23:40 +00:00
Tom Houlé
1702a05920
cloud_llm_client: Delete unused variants of CompletionRequestStatus (#49516)
Small clean up commit.

Co-authored-by: Marshall <marshall@zed.dev>

Release Notes:

- N/A
2026-02-18 14:30:40 -05:00
Sathiyaraman M
6e33d838c9
copilot: Display cost multiplier for GitHub Copilot models (#44800)
### Description

Related Discussions: #44499, #35742, #31851

Display cost multiplier for GitHub Copilot models in the model selectors
(Both in Chat Panel and Inline Assistant)

<img width="436" height="800" alt="image"
src="https://github.com/user-attachments/assets/c9ebd8fa-4d55-4be8-b3e1-f46dbf1f0145"
/>


### Some technical notes

Although this PR's primary intent is to show the cost multiplier for
GitHub Copilot models alone, I have included some necessary plumbing to
allow specifying costs for other providers in future. I have introduced
an enum called `LanguageModelCostInfo` for showing cost in different
ways for different models. Now, this enum is used in `LanguageModel`
trait to get the cost info.

To begin with, in `LanguageModelCostInfo`, I have specified two
ways of pricing: Request-based (1 Agent request - GitHub Copilot uses
this) and Token-based (1M Input tokens / 1M Output tokens). I had
initially thought about adding a `Free` type, especially for Ollama but
didn't do it after realizing that Ollama has paid plans. Right now, only
the Request-based pricing is implemented and used for Copilot models.
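
A minimal sketch of what such a two-variant cost enum could look like. The variant names, field names, and `label` formatting below are assumptions for illustration, not the actual `LanguageModelCostInfo` definition:

```rust
/// Hypothetical shape of a cost enum with the two pricing styles above.
#[derive(Clone, Debug, PartialEq)]
pub enum CostInfo {
    /// Request-based: a multiplier applied per agent request.
    RequestCost { cost_per_request: f64 },
    /// Token-based: price per 1M input and per 1M output tokens.
    TokenCost { input_per_million: f64, output_per_million: f64 },
}

impl CostInfo {
    /// Short label suitable for a model selector row.
    pub fn label(&self) -> String {
        match self {
            CostInfo::RequestCost { cost_per_request } => {
                format!("{}x", cost_per_request)
            }
            CostInfo::TokenCost { input_per_million, output_per_million } => {
                format!("${}/1M in, ${}/1M out", input_per_million, output_per_million)
            }
        }
    }
}
```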

Feel free to suggest changes to improve this design.

Release Notes:

- Show cost multiplier for GitHub Copilot models

---------

Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
2026-02-16 15:24:59 -03:00
Tom Houlé
93ead966c2
cloud_llm_client: Add StreamEnded and Unknown variants to CompletionRequestStatus (#49121)
Add StreamEnded variant so the client can distinguish between a stream
that the cloud ran to completion versus one that was interrupted (see
CLO-258). **That logic is to be added in a follow up PR**.

Add an Unknown fallback with #[serde(other)] for forward-compatible
deserialization of future variants.

The client advertises support via a new
x-zed-client-supports-stream-ended-request-completion-status header. The
server will only send the new variant if that header is passed. Both
StreamEnded and Unknown are silently ignored at the event mapping layer
(from_completion_request_status returns Ok(None)).
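
The forward-compatible fallback can be sketched without serde as below. The variant set, the wire tags, and the `to_event` signature are hypothetical; the real code relies on `#[serde(other)]` and `from_completion_request_status`:

```rust
/// Trimmed-down, hypothetical version of the status enum.
#[derive(Debug, PartialEq)]
pub enum CompletionRequestStatus {
    Started,
    StreamEnded,
    /// Catch-all for variants this client does not know about yet.
    Unknown,
}

/// Hand-rolled stand-in for what `#[serde(other)]` provides:
/// unrecognized tags map to `Unknown` instead of failing to parse.
pub fn parse_status(tag: &str) -> CompletionRequestStatus {
    match tag {
        "started" => CompletionRequestStatus::Started,
        "stream_ended" => CompletionRequestStatus::StreamEnded,
        _ => CompletionRequestStatus::Unknown,
    }
}

/// Map a status to an optional event; `StreamEnded` and `Unknown`
/// are silently ignored by returning `Ok(None)`.
pub fn to_event(status: CompletionRequestStatus) -> Result<Option<&'static str>, String> {
    match status {
        CompletionRequestStatus::Started => Ok(Some("started")),
        CompletionRequestStatus::StreamEnded | CompletionRequestStatus::Unknown => Ok(None),
    }
}
```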

Part of CLO-264 and CLO-266; cloud-side changes to follow.

Release Notes:

- N/A

---------

Co-authored-by: Marshall Bowers <git@maxdeviant.com>
2026-02-16 15:39:47 +01:00
Mikayla Maki
85294063fc
Strip broken thinking blocks from Anthropic requests (#48548)
TODO:

- [x] Review code
- [x] Decide whether to keep ignored API tests

Release Notes:

- Fixed a bug where cancelling a thread mid-thought would cause further
anthropic requests to fail
- Fixed a bug where the model configured on a thread would not be
persisted alongside that thread
2026-02-07 04:21:58 +00:00
Marshall Bowers
afafb66f76
agent: Highlight latest models available through the Zed provider (#48614)
This PR updates the model selector to highlight the latest models that
are available through the Zed provider:

<img width="388" height="477" alt="Screenshot 2026-02-06 at 1 46 41 PM"
src="https://github.com/user-attachments/assets/70760399-ecf6-46e3-80a7-cb998216c192"
/>

Closes CLO-205.

Release Notes:

- Added a "Latest" indicator to highlight the latest models available
through the Zed provider.
2026-02-06 14:03:03 -05:00
Marshall Bowers
9860106b8e
agent: Add support for setting thinking effort for Zed provider (#48545)
This PR adds the ability to set the thinking effort of a model.

Right now this only applies to Opus 4.6 through the Zed provider.

This is gated behind the `cloud-thinking-toggle` feature flag.

UI is still rough; needs a design pass:

<img width="639" height="163" alt="Screenshot 2026-02-05 at 7 45 54 PM"
src="https://github.com/user-attachments/assets/2b5a9ef8-74cd-498e-9c81-b92666572409"
/>

<img width="263" height="148" alt="Screenshot 2026-02-05 at 7 45 58 PM"
src="https://github.com/user-attachments/assets/40232cb0-1743-443b-b04c-5cd33065513d"
/>

Release Notes:

- N/A
2026-02-06 01:04:53 +00:00
Marshall Bowers
a2ca07514c
language_model: Add supported_effort_levels method to LanguageModel (#48523)
This PR adds a new `supported_effort_levels` method to the
`LanguageModel` trait.

This can be used to retrieve the list of effort levels that the model
supports, which will eventually be used to drive the UI for selecting
the thinking effort.

Right now this list will only be populated for Cloud models.

Release Notes:

- N/A
2026-02-05 22:20:08 +00:00
Marshall Bowers
25904f691e
Add support for refreshing outdated LLM tokens (#47512)
This PR adds support for refreshing LLM tokens that are "outdated"—that
is, that are missing some required claims.

Release Notes:

- Fixed some instances of authentication errors with the Zed API that
could be resolved automatically by refreshing the token.
2026-01-23 21:03:28 +00:00
Marshall Bowers
097cfae77e
Add helper method for checking if the LLM token needs to be refreshed (#47511)
This PR adds a new `needs_llm_token_refresh` helper method for checking
if the LLM token needs to be refreshed.

We were duplicating the check for the `x-zed-expired-token` header in a
number of spots, and it will be gaining an additional case soon.
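
A minimal sketch of such a helper, assuming headers are available as a plain string map. The signature is an assumption; only the `x-zed-expired-token` header name comes from this PR:

```rust
use std::collections::HashMap;

/// Header the server sets when the LLM token has expired.
pub const EXPIRED_TOKEN_HEADER: &str = "x-zed-expired-token";

/// Returns true if the response headers indicate that the LLM token
/// should be refreshed. Header names are compared case-insensitively.
pub fn needs_llm_token_refresh(headers: &HashMap<String, String>) -> bool {
    headers
        .keys()
        .any(|name| name.eq_ignore_ascii_case(EXPIRED_TOKEN_HEADER))
}
```

Centralizing the check means a future refresh condition only has to be added in one place.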

Release Notes:

- N/A
2026-01-23 20:50:50 +00:00
Richard Feldman
29cf14ed2f
Fix rate limiter holding permits during tool execution (#47494)
The rate limiter's semaphore guard was being held for the entire
duration of a turn, including during tool execution. This caused
deadlocks when subagents tried to acquire permits while parent requests
were waiting for them to complete.

## The Problem

In `run_turn_internal`, the stream (which contains the `RateLimitGuard`
holding the semaphore permit) was kept alive throughout the entire loop
iteration - including during **tool execution**:

1. Parent request acquires permit
2. Parent starts streaming, consumes response
3. Parent starts executing tools (subagents)
4. **Stream/guard still held** while tools execute
5. Subagents try to acquire permits → blocked because parent still holds
permit
6. Deadlock if all permits are held by parents waiting for subagent
children

## The Fix

Two changes were made:

1. **Drop the stream early**: Added an explicit `drop(events)` after the
stream is fully consumed but before tool execution begins. This releases
the rate limit permit so subagents can acquire it.

2. **Removed the `bypass_rate_limit` workaround**: Since the root cause
is now fixed, the bypass mechanism is no longer needed.
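
The effect of the early drop can be sketched with a toy single-threaded limiter. `RateLimiter` and `Permit` here are hypothetical and much simpler than the real semaphore-backed `RateLimitGuard`:

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Toy rate limiter that counts available permits.
pub struct RateLimiter {
    available: Rc<Cell<usize>>,
}

/// Holding a `Permit` occupies one slot; dropping it frees the slot.
pub struct Permit {
    available: Rc<Cell<usize>>,
}

impl RateLimiter {
    pub fn new(limit: usize) -> Self {
        RateLimiter { available: Rc::new(Cell::new(limit)) }
    }

    /// Returns `None` when all permits are taken (where real code would block).
    pub fn try_acquire(&self) -> Option<Permit> {
        let remaining = self.available.get();
        if remaining == 0 {
            return None;
        }
        self.available.set(remaining - 1);
        Some(Permit { available: Rc::clone(&self.available) })
    }
}

impl Drop for Permit {
    fn drop(&mut self) {
        // Release the slot as soon as the guard goes away.
        self.available.set(self.available.get() + 1);
    }
}
```

With a limit of 1, a subagent's `try_acquire` fails while the parent still holds its permit and succeeds once the parent calls `drop` on it, which mirrors the `drop(events)` placed before tool execution.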

Note: no release notes because subagents are still feature-flagged, and
this rate limiting change isn't actually observable without them.

Release Notes:

- N/A
2026-01-23 12:15:55 -05:00
Marshall Bowers
ec981b8301
agent: Add thinking toggle for Zed provider (#47407)
This PR adds a thinking toggle for controlling whether to use thinking
for a model in the Zed provider:

<img width="645" height="142" alt="Screenshot 2026-01-22 at 12 34 01 PM"
src="https://github.com/user-attachments/assets/9aa543fe-e708-4840-8b38-1a6fbcb78388"
/>

Previously we would create separate "Thinking" variants of the models
that supported thinking in the model selector.

This only applies to Anthropic models in the Zed provider, currently.

This is gated behind the `cloud-thinking-toggle` feature flag.

Release Notes:

- N/A

---------

Co-authored-by: Neel <neel@zed.dev>
2026-01-22 18:08:32 +00:00
Richard Feldman
21050e2d37
Fix nested request rate limiting deadlock for subagent edit_file (#47232)
## Problem

When subagents use the `edit_file` tool, it creates an `EditAgent` that
makes its own model request to get the edit instructions. These "nested"
requests compete with the parent subagent conversation requests for rate
limiter permits.

The rate limiter uses a semaphore with a limit of 4 concurrent requests
per model instance. When multiple subagents run in parallel:

1. 3 subagents each hold 1 permit for their ongoing conversation streams
(3 permits used)
2. When all 3 try to use `edit_file` simultaneously, their edit agents
need permits too
3. Only 1 edit agent can get the 4th permit; the other 2 block waiting
4. The blocked edit agents can't complete, so their parent subagent
conversations can't complete
5. The parent conversations hold their permits, so the blocked edit
agents stay blocked
6. **Deadlock**

## Solution

Added a `bypass_rate_limit` field to `LanguageModelRequest`. When set to
`true`, the request skips the rate limiter semaphore entirely. The
`EditAgent` sets this flag because its requests are already "part of" a
rate-limited parent request.

(No release notes because subagents are still feature-flagged.)

Release Notes:
- N/A

---------

Co-authored-by: Zed Zippy <234243425+zed-zippy[bot]@users.noreply.github.com>
2026-01-20 21:51:54 -05:00
Mason Palmer
c1da016013
agent: Patch image format bug (#45978)
Closes #44694

Release Notes:

- Fixed images being converted to png but retaining old format

## Before:
<img width="574" height="327" alt="Screenshot 2026-01-02 at 5 34 24 PM"
src="https://github.com/user-attachments/assets/92331939-cebc-4f53-99fc-10d5181cf87e"
/>

## After:
<img width="638" height="489" alt="Screenshot 2026-01-02 at 5 34 36 PM"
src="https://github.com/user-attachments/assets/47c05906-fa56-4a53-abd4-790c42230772"
/>

---------

Co-authored-by: versecafe <147033096+versecafe@users.noreply.github.com>
Co-authored-by: MrSubidubi <finn@zed.dev>
2026-01-19 21:13:14 +00:00
Marshall Bowers
a92df1eee4
Remove Burn Mode code (#46950)
This PR removes the code for Burn Mode, as we won't need it anymore
after the 17th.

Closes CLO-79.

Release Notes:

- N/A
2026-01-15 21:28:33 +00:00
Marshall Bowers
6fcc5e9461
Remove legacy billing code (#46927)
This PR removes the code for the legacy plans.

No more users will be on this plan as of January 17th, so it's fine to
land these changes now (as they won't be released until the 21st).

Closes CLO-76.

Release Notes:

- N/A
2026-01-15 13:06:45 -05:00
Mikayla Maki
9c5fc6ecbd
Split token display for OpenAI (#46829)
This feature cost $15.

Up -> Tokens we're sending to the model
Down -> Tokens we've received from the model.

<img width="377" height="69" alt="Screenshot 2026-01-14 at 12 31 01 PM"
src="https://github.com/user-attachments/assets/fc15824f-de5d-466b-8cc1-329f3c1940bb"
/>



Release Notes:

- Changed the display of tokens for OpenAI models to reflect the
input/output limits.

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 14:29:56 -08:00
Nathan Sobo
b091cc4d9a
Enforce 5MB per-image limit when converting images for language models (#45313)
## Problem

When users paste or drag large images into the agent panel, the encoded
payload can exceed upstream provider limits (e.g., Anthropic's 5MB
per-image limit), causing API errors.

## Solution

Enforce a default 5MB limit on encoded PNG bytes in
`LanguageModelImage::from_image`:

1. Apply existing Anthropic dimension limits first (1568px max in either
dimension)
2. Iteratively downscale by ~15% per pass until the encoded PNG is under
5MB
3. Return `None` if the image can't be shrunk within 8 passes
(fail-safe)
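
The downscale loop in steps 2 and 3 can be sketched as follows. The `encoded_len` callback is a hypothetical stand-in for re-encoding the PNG, and treating 5MB as `5 * 1024 * 1024` bytes is an assumption:

```rust
/// Assumed byte limit; the source only says "5MB".
pub const MAX_IMAGE_BYTES: usize = 5 * 1024 * 1024;
/// Give up after this many downscale passes (fail-safe).
pub const MAX_PASSES: u32 = 8;

/// Shrink dimensions by ~15% per pass until the encoded size fits.
/// `encoded_len(width, height)` is a hypothetical callback returning the
/// encoded PNG size for those dimensions.
pub fn shrink_to_limit<F>(mut width: u32, mut height: u32, encoded_len: F) -> Option<(u32, u32)>
where
    F: Fn(u32, u32) -> usize,
{
    if encoded_len(width, height) <= MAX_IMAGE_BYTES {
        return Some((width, height));
    }
    for _ in 0..MAX_PASSES {
        width = (width as f64 * 0.85).round().max(1.0) as u32;
        height = (height as f64 * 0.85).round().max(1.0) as u32;
        if encoded_len(width, height) <= MAX_IMAGE_BYTES {
            return Some((width, height));
        }
    }
    // Could not get under the limit within the allowed passes.
    None
}
```

Using a pessimistic 4 bytes per pixel as the size estimate, a 1568x1568 image fits after two passes at 1133x1133.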

The limit is enforced at the `LanguageModelImage` conversion layer,
which is the choke point for all image ingestion paths (agent panel
paste/drag, file mentions, text threads, etc.).

## Future Work

The 5MB limit is a conservative default. Provider-specific limits can be
introduced later by adding a `from_image_with_constraints` API.

## Testing

Added a regression test that:
1. Generates a noisy 4096x4096 PNG (guaranteed >5MB)
2. Converts it via `LanguageModelImage::from_image`
3. Asserts the result is ≤5MB and was actually downscaled

---

**Note:** This PR builds on #45312 (prompt store fail-open fix). Please
merge that first.

cc @rtfeldman

---------

Co-authored-by: Zed Zippy <234243425+zed-zippy[bot]@users.noreply.github.com>
2025-12-19 15:04:41 -05:00
Richard Feldman
6055b45ee1
Add support for provider extensions (but no extensions yet) (#45277)
This adds support for provider extensions but doesn't actually add any
yet.

Release Notes:

- N/A
2025-12-18 17:05:04 -05:00
Xiaobo Liu
a176a8c47e
agent: Allow LanguageModelImage size to be optional (#44956)
Release Notes:

- Allowed LanguageModelImage size to be optional

Signed-off-by: Xiaobo Liu <cppcoffee@gmail.com>
2025-12-16 08:50:40 +00:00
Mikayla Maki
d7da5d3efd
Finish inline telemetry changes (#44842)
Closes #ISSUE

Release Notes:

- N/A
2025-12-15 04:07:44 +00:00
Michael Benfield
488fa02547
Streaming tool use for inline assistant (#44751)
Depends on: https://github.com/zed-industries/zed/pull/44753

Release Notes:

- N/A

---------

Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>
2025-12-14 03:22:20 +00:00
Danilo Leal
0283bfb049
Enable configuring edit prediction providers through the settings UI (#44505)
- Edit prediction providers can now be configured through the settings
UI
- Cleaned up the status bar menu to only show _configured_ providers
- Added to the status bar icon button tooltip the name of the active
provider
- Only display the data collection functionality under "Privacy" for the
Zed models
- Moved the Codestral edit prediction provider out of the Mistral
section in the agent panel into the settings UI
- Refined and improved UI and states for configuring GitHub Copilot as
both an agent and edit prediction provider

#### Todos before merge:

- [x] UI: Unify with settings UI style and tidy it all up
- [x] Unify Copilot modal `impl`s to use separate window
- [x] Remove stop light icons from GitHub modal
- [x] Make dismiss events work on GitHub modal
- [ ] Investigate workarounds to tell if Copilot authenticated even when
LSP not running


Release Notes:

- settings_ui: Added a section for configuring edit prediction providers
under AI > Edit Predictions, including Codestral and GitHub Copilot.
Once you've updated you can use the following link to open it:
zed://settings/edit_predictions.providers

---------

Co-authored-by: Ben Kunkle <ben@zed.dev>
2025-12-13 11:06:30 -05:00
Michael Benfield
5cd30e5106
inline assistant: Use tools and remove insertion mode (#44248)
Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>
Co-authored-by: Danilo Leal <daniloleal09@gmail.com>

Release Notes:

- N/A
2025-12-05 13:28:29 -08:00