Self-Review Checklist:
- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable
Closes #ISSUE
Release Notes:
- N/A or Added/Fixed/Improved ...
This change lays the groundwork for canceling in-flight requests.
Without it, we always wait for a request to complete, even when we
already know its result won't be used.
Release Notes:
- N/A
Starting ~3 weeks ago, `output` no longer contains the cursor marker,
cloud strips it on parsing. Instead, it should return a cursor offset.
Release Notes:
- Fixed moving the cursor to a predicted position in Zeta 2
Drop the `count_tokens` API and related implementations across
providers, and remove the unused `tiktoken-rs` dependency.
I was going to update the dependency becuase they finally released a fix
we needed. But then I realized we only used this api in one place, the
Rules library. And for most models it would have been wildly incorrect
becuase we use tiktoken, i.e. OpenAI tokenizers, for almost every model,
which is going to give incorrect results.
Given that, I just removed these because the difference in how we get
these has caused plenty of confusion in the past.
Self-Review Checklist:
- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable
Release Notes:
- N/A
Self-Review Checklist:
- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable
Closes #ISSUE
Release Notes:
- N/A or Added/Fixed/Improved ...
Self-Review Checklist:
- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable
Closes #ISSUE
Release Notes:
- N/A or Added/Fixed/Improved ...
- `language_model` no longer depends on provider-specific crates such as
`anthropic` and `open_ai` (inverted dependency)
- `language_model_core` was extracted from `language_model` which
contains the types for the provider-specific crates to convert to/from.
- `gpui::SharedString` has been extracted into its own crate (still
exposed by `gpui`), so `language_model_core` and provider API crates
don't have to depend on `gpui`.
- Removes some unnecessary `&'static str` | `SharedString` -> `String`
-> `SharedString` conversions across the codebase.
- Extracts the core logic of the cloud `LanguageModelProvider` into its
own crate with simpler dependencies.
Release Notes:
- N/A
---------
Co-authored-by: John Tur <john-tur@outlook.com>
This PR moves the `CompletionIntent` enum from the `cloud_llm_client`
crate to the `language_model` crate, as it is no longer part of the
Cloud interface.
Release Notes:
- N/A
## Context
Ensure subagent threads build requests with the Subagent intent instead
of UserPrompt. This allows us to properly
attribute this as a tool call for certain providers instead of a user
request.
## Self-Review Checklist
<!-- Check before requesting review: -->
- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable
Release Notes:
- copilot_chat: Fix subagent requests being marked as user requests.
## Context
This PR adds some derives which make tracing easier on cloud side.
## Self-Review Checklist
<!-- Check before requesting review: -->
- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable
Release Notes:
- N/A
Closes #ISSUE
Before you mark this PR as ready for review, make sure that you have:
- [ ] Added a solid test coverage and/or screenshots from doing manual
testing
- [ ] Done a self-review taking into account security and performance
aspects
- [ ] Aligned any UI changes with the [UI
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
Release Notes:
- N/A *or* Added/Fixed/Improved ...
Co-authored-by: Oleksiy <oleksiy@zed.dev>
This will help with test times (in some cases), as nextest cannot figure
out whether a given rdep is actually an alive edge of the build graph
Closes #ISSUE
Before you mark this PR as ready for review, make sure that you have:
- [ ] Added a solid test coverage and/or screenshots from doing manual
testing
- [ ] Done a self-review taking into account security and performance
aspects
- [ ] Aligned any UI changes with the [UI
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
Release Notes:
- N/A
This is a staff only toggle for now, since the consequences of
activating it are not obvious and quite dire (tokens costs 6 times
more).
Also, persist thinking, thinking effort and fast mode in DbThread so the
thinking mode toggle and thinking effort are persisted.
Release Notes:
- Agent: The thinking mode toggle and thinking effort are now persisted
when selecting a thread from history.
Closes #ISSUE
Before you mark this PR as ready for review, make sure that you have:
- [ ] Added a solid test coverage and/or screenshots from doing manual
testing
- [ ] Done a self-review taking into account security and performance
aspects
- [ ] Aligned any UI changes with the [UI
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
Release Notes:
- N/A *or* Added/Fixed/Improved ...
Co-authored-by: Max <max@zed.dev>
Add StreamEnded variant so the client can distinguish between a stream
that the cloud ran to completion versus one that was interrupted (see
CLO-258). **That logic is to be added in a follow up PR**.
Add an Unknown fallback with #[serde(other)] for forward-compatible
deserialization of future variants.
The client advertises support via a new
x-zed-client-supports-stream-ended-request-completion-status header. The
server will only send the new variant if that header is passed. Both
StreamEnded and Unknown are silently ignored at the event mapping layer
(from_completion_request_status returns Ok(None)).
Part of CLO-264 and CLO-266; cloud-side changes to follow.
Release Notes:
- N/A
---------
Co-authored-by: Marshall Bowers <git@maxdeviant.com>
This PR updates the model selector to highlight the latest models that
are available through the Zed provider:
<img width="388" height="477" alt="Screenshot 2026-02-06 at 1 46 41 PM"
src="https://github.com/user-attachments/assets/70760399-ecf6-46e3-80a7-cb998216c192"
/>
Closes CLO-205.
Release Notes:
- Added a "Latest" indicator to highlight the latest models available
through the Zed provider.
This allows us to switch the prompt format without client-side changes.
If we want to experiment with prompt formats or models other than the
currently-deployed one, we can use the raw endpoint, and do prompt
construction and output processing on the client.
This also adds an optional environment parameter to the raw endpoint, so
that we can use that endpoint in the new scheme where we're deploying to
separate environments for different zeta prompt versions.
Release Notes:
- N/A
This PR adds a new `supported_effort_levels` method to the
`LanguageModel` trait.
This can be used to retrieve the list of effort levels that the model
supports, which will eventually be used to drive the UI for selecting
the thinking effort.
Right now this list will only be populated for Cloud models.
Release Notes:
- N/A
This PR adds support for refreshing LLM tokens that are "outdated"—that
is, that are missing some required claims.
Release Notes:
- Fixed some instances of authentication errors with the Zed API that
could be resolved automatically by refreshing the token.
This PR removes the code for the legacy plans.
No more users will be on this plan as of January 17th, so it's fine to
land these changes now (as they won't be released until the 21st).
Closes CLO-76.
Release Notes:
- N/A
We currently attempt to flush all rejected predictions at once even if
we have accumulated more than
`MAX_EDIT_PREDICTION_REJECTIONS_PER_REQUEST`. Instead, we will now flush
as many as possible, and then keep the rest for the next batch.
Release Notes:
- N/A
Adds a `trigger` field to the zeta1/zeta2 prediction requests so that we
can distinguish between editor, diagnostic, and zeta-cli requests.
Release Notes:
- N/A
Many prediction requests end up being rejected early without ever being
set as the current prediction. Before this change, those cases weren’t
reported as rejections because the `request_prediction_with_*` functions
simply returned `Ok(None)`.
With this update, whenever we get a successful response from the
provider, we will return at least the `id`, allowing it to be properly
reported. The request now also includes a “reject reason,” since the
different variants carry distinct implications for prediction quality.
All of these scenarios are now covered by tests. While adding them, I
also found and fixed a bug where some cancelled predictions were
incorrectly being set as the current one.
Release Notes:
- N/A
---------
Co-authored-by: MrSubidubi <dev@bahn.sh>
We've realized that a lot of the logic within an
`EditPredictionProvider` is not specific to a particular edit prediction
model / service. Rather, it is just the generic state management
required to perform edit predictions at all in Zed. We want to move to a
setup where there's one "built-in" edit prediction provider in Zed,
which can be pointed at different edit prediction models. The only logic
that is different for different models is how we construct the prompt,
send the request, and parse the output.
This PR also changes the behavior of the staff-only `zeta2` feature flag
so that in only gates your *ability* to use Zeta2, but you can still use
your local settings file to choose between different edit prediction
models/services: zeta1, zeta2, and sweep.
This PR also makes zeta1's outcome reporting and prediction-rating
features work with all prediction models, not just zeta1.
To do:
* [x] remove duplicated logic around sending cloud requests between
zeta1 and zeta2
* [x] port the outcome reporting logic from zeta to zeta2.
* [x] get the "rate completions" modal working with all EP models
* [x] display edit prediction diff
* [x] show edit history events
* [x] remove the original `zeta` crate.
Release Notes:
- N/A
---------
Co-authored-by: Agus Zubiaga <agus@zed.dev>
Co-authored-by: Ben Kunkle <ben@zed.dev>
1. Introduce a common `PromptFormatter` trait
2. Let models define their generation params.
3. Add support for the experimental 1120-seedcoder prompt format
Release Notes:
- N/A
This prompt is for a fine-tuned model. It has the following changes,
compared to `minimal`:
- No instructions at all, except for one sentence at the beginning of
the prompt.
- Output is a simplified unified diff -- hunk headers have no line
counts (e.g., `@@ -20 +20 @@`)
- Qwen's FIM tokens are used where possible (`<|file_sep|>`,
`<|fim_prefix|>`, `<|fim_suffix|>`, etc.)
To evaluate this model:
```
ZED_ZETA2_MODEL=zeta2-exp [usual zeta-cli eval params ...] --prompt-format minimal-qwen
```
This will point to the most recent Baseten deployment of zeta2-exp
(which may change in the future, so the prompt-format may get out of
sync).
Release Notes:
- N/A
1. Add `--prompt-format=minimal` that matches single-sentence
instructions used in fine-tuned models (specifically, in `1028-*` and
`1029-*` models)
2. Use separate configs for agentic context search model and edit
prediction model. This is useful when running a fine-tuned EP model, but
we still want to run vanilla model for context retrieval.
3. `zeta2-exp` is a symlink to the same-named Baseten deployment. This
model can be redeployed and updated without having to update the
deployment id.
4. Print scores as a compact table
Release Notes:
- N/A
---------
Co-authored-by: Piotr Osiewicz <piotr@zed.dev>
I wanted to be able to work offline, so I made it a little bit more
convenient to point zeta2 at ollama.
* For zeta2, don't require that request ids be UUIDs
* Add an env var `ZED_ZETA2_OLLAMA` that sets the edit prediction URL
and model id to work w/ ollama.
Release Notes:
- N/A
This PR implements the `zeta-cli eval` command. It will:
- Run the edit prediction model if there are no cached results
- Compute precision/recall/F1 for context retrieval at the line level:
every retrieved line of context is counted as a true positive (correct
retrieval), false positive (retrieved something that was not expected),
or false negative (didn't retrieve an expected line)
- Compute similar metrics for edit predictions
- Pretty-print results, highlighting the difference between actual and
expected when printing to tty
Other changes:
- `zeta-cli predict` accepts a `--format` argument with options `md`,
`json`, `diff`
- Code restructure
Release Notes:
- N/A
---------
Co-authored-by: Piotr Osiewicz <24362066+osiewicz@users.noreply.github.com>
Co-authored-by: Agus Zubiaga <agus@zed.dev>
This PR adds a `zeta zeta2 predict` subcommand that takes an edit
prediction example markdown file as an argument, and performs zeta2's
prediction, showing the retrieved context and the predicted edit.
* [x] Apply uncommitted diff to get repo into the right state.
* [x] Apply edits in edit history
* [x] Display predicted edits as unified diff, regardless of model
output format
Release Notes:
- N/A
---------
Co-authored-by: Agus Zubiaga <agus@zed.dev>
Co-authored-by: Piotr Osiewicz <24362066+osiewicz@users.noreply.github.com>
Co-authored-by: Ben Kunkle <ben.kunkle@gmail.com>
We've been considering removing workspace-hack for a couple reasons:
- Lukas ran into a situation where its build script seemed to be causing
spurious rebuilds. This seems more likely to be a cargo bug than an
issue with workspace-hack itself (given that it has an empty build
script), but we don't necessarily want to take the time to hunt that
down right now.
- Marshall mentioned hakari interacts poorly with automated crate
updates (in our case provided by rennovate) because you'd need to have
`cargo hakari generate && cargo hakari manage-deps` after their changes
and we prefer to not have actions that make commits.
Currently removing workspace-hack causes our workspace to grow from
~1700 to ~2000 crates being built (depending on platform), which is
mainly a problem when you're building the whole workspace or running
tests across the the normal and remote binaries (which is where
feature-unification nets us the most sharing). It doesn't impact
incremental times noticeably when you're just iterating on `-p zed`, and
we'll hopefully get these savings back in the future when
rust-lang/cargo#14774 (which re-implements the functionality of hakari)
is finished.
Release Notes:
- N/A